Undergraduate Lecture Notes in Physics

Albrecht Lindner
Dieter Strauch


A Complete 
Course on 
Theoretical 
Physics 


From Classical Mechanics to Advanced 
Quantum Statistics 


Springer


Undergraduate Lecture Notes in Physics 


Series editors 


Neil Ashby, University of Colorado, Boulder, CO, USA 

William Brantley, Department of Physics, Furman University, Greenville, SC, USA 
Matthew Deady, Physics Program, Bard College, Annandale-on-Hudson, NY, USA 
Michael Fowler, Department of Physics, University of Virginia, Charlottesville, 
VA, USA 

Morten Hjorth-Jensen, Department of Physics, University of Oslo, Oslo, Norway 


Undergraduate Lecture Notes in Physics (ULNP) publishes authoritative texts covering 
topics throughout pure and applied physics. Each title in the series is suitable as a basis for 
undergraduate instruction, typically containing practice problems, worked examples, chapter 
summaries, and suggestions for further reading. 


ULNP titles must provide at least one of the following: 


• An exceptionally clear and concise treatment of a standard undergraduate subject.
• A solid undergraduate-level introduction to a graduate, advanced, or non-standard subject.
• A novel perspective or an unusual approach to teaching a subject.


ULNP especially encourages new, original, and idiosyncratic approaches to physics teaching 
at the undergraduate level. 


The purpose of ULNP is to provide intriguing, absorbing books that will continue to be the 
reader’s preferred reference throughout their academic career. 


More information about this series at http://www.springer.com/series/8917




Albrecht Lindner
Pinneberg, Germany

Dieter Strauch
Theoretical Physics
University of Regensburg
Regensburg, Germany


ISSN 2192-4791 ISSN 2192-4805 (electronic) 
Undergraduate Lecture Notes in Physics 
ISBN 978-3-030-04359-9 ISBN 978-3-030-04360-5 (eBook) 


https://doi.org/10.1007/978-3-030-04360-5 
Library of Congress Control Number: 2018961698 


The original, German edition was published in 2011 under the title “Grundkurs Theoretische Physik”. 
© Springer Nature Switzerland AG 2018 

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part 
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, 
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission 
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar 
methodology now known or hereafter developed. 

The use of general descriptive names, registered names, trademarks, service marks, etc. in this 
publication does not imply, even in the absence of a specific statement, that such names are exempt from 
the relevant protective laws and regulations and therefore free for general use. 

The publisher, the authors and the editors are safe to assume that the advice and information in this 
book are believed to be true and accurate at the date of publication. Neither the publisher nor the 
authors or the editors give a warranty, express or implied, with respect to the material contained herein or 
for any errors or omissions that may have been made. The publisher remains neutral with regard to 
jurisdictional claims in published maps and institutional affiliations. 


This Springer imprint is published by the registered company Springer Nature Switzerland AG 
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland 


In memory of Albrecht Lindner (1935-2005), 
scientist, teacher, friend 


Preface 


This textbook is a translation of the third German edition of Grundkurs 
Theoretische Physik (A Basic Course on Theoretical Physics), originally published 
by Teubner, Stuttgart, Germany. Actually, this edition is much more than a typical 
textbook since it offers a mixture of basic and advanced material of all of the 
fundamental disciplines of theoretical physics in one volume, whence it may well 
serve also as a reference book. The large number of cross-references will guide the 
reader from the basic experimental observations to the construction of a “unified” 
theory, and the present compactness should ensure that the reader does not get lost 
along the way. 

A wide range of problems invite the reader to tackle further applications at 
various stages of sophistication, and a list of textbooks offers the way forward to 
possible open questions. 

The material itself and the way it is presented are due to the late Albrecht Lindner.
My contribution is restricted merely to the translation into the English language; in
fact, my sincerest gratitude goes to Dr. Steven Lyle, who corrected the translation in
many places; whatever remains of insufficient vocabulary or grammar is due to my
limited mastery of the language. The only changes I have made are to adjust to the
publisher’s requirements, to make some changes in the numerical tables as to be
expected from May 2019 on, and to adapt the list of textbooks to an English-speaking
readership.

I am proud, nevertheless, to present this book to the English-speaking 
community. 


Regensburg, Germany Dieter Strauch 




Preface to the First German Edition 


Like the standard course in theoretical physics, the present book introduces the
physics of particles under the heading Classical Mechanics, the physics of fields
under Electromagnetism, quantum physics under Quantum Mechanics I, and
statistical physics under Thermodynamics and Statistics. Besides these branches,
which would form a curriculum for all students of physics, there is a complement
entitled Quantum Mechanics II, for those who wish to obtain a deeper
understanding of the theory, which discusses scattering problems, quantization of fields,
and Dirac theory (as an example of relativistic quantum mechanics).

The goal here is to stress the interrelations between the individual subjects. In an
introductory chapter, there is a summary of the most important mathematical
tools repeatedly needed in the different branches of physics. These constitute the
mathematical foundation for rationalizing our practical experience, since we wish
to describe our observations as precisely as possible.

The selection of material was mainly inspired by our local physics diploma 
curriculum. Only in a few places did I go beyond those limits, e.g., in Sect. 4.6 
(quantum theory and dissipation), Sect. 5.2 (three-body scattering), and Sect. 5.4 
(quasi-particles, quantum optics), since I have the impression that the essentials can 
also be worked out rather easily in these areas. 

Section 5.5 on the Dirac equation also differs from the standard presentation,
because I prefer the Weyl representation over the standard representation, despite
my intention to avoid any special representation as far as possible. In this respect, I
am grateful to my colleagues Till Anders (Munich), Dietmar Kolb (Kassel), and
Gernot Münster (Münster) for their valuable comments on my drafts.

Thanks go also to numerous students in Hamburg and especially to Dr. Heino 
Freese and Dr. Adolf Kitz for many questions and suggestions, and various forms 
of support. The general interest in my notes encourages me to present these now to 
a larger community. 

(Notes on figure production are left out here—D.S.) 


Hamburg, Germany Albrecht Lindner 
Fall 1993 


Preface to the Second German Edition 


The text has been improved in many places, in particular in Sects. 3.5 and 5.4, and
all figures have been inserted with pstricks. In addition, three-dimensional objects
now appear in central instead of parallel perspective.


Hamburg, Germany Albrecht Lindner 
Summer 1996 




Preface to the Third German Edition 


The Basic Course (Grundkurs) was discovered in a third, extensively revised 
edition, after Albrecht Lindner, a passionate teacher, unexpectedly passed away. As 
one of those rare textbooks which presents a complete curriculum of theoretical 
physics in a single volume—compact and simultaneously profound—it should be 
offered to the teacher and student community. In the present third edition the 
material has been revised in many places, and the number of figures has been 
approximately doubled. Also in this edition is an additional chapter containing 
numerous problems. 

My contribution here is restricted to adjusting the material to the changed 
appearance required by the Teubner publishing company. 


Regensburg, Germany Dieter Strauch 
Spring 2011 




Contents 


1 Basics of Experience
  1.1 Vector Analysis
    1.1.1 Space and Time
    1.1.2 Vector Algebra
    1.1.3 Trajectories
    1.1.4 Vector Fields
    1.1.5 Gradient (Slope Density)
    1.1.6 Divergence (Source Density)
    1.1.7 Curl (Vortex Density)
    1.1.8 Rewriting Products. Laplace Operator
    1.1.9 Integral Theorems for Vector Expressions
    1.1.10 Delta Function
    1.1.11 Fourier Transform
    1.1.12 Calculation of a Vector Field from Its Sources and Curls
    1.1.13 Vector Fields at Interfaces
  1.2 Coordinates
    1.2.1 Orthogonal Transformations and Euler Angles
    1.2.2 General Coordinates and Their Base Vectors
    1.2.3 Coordinate Transformations
    1.2.4 The Concept of a Tensor
    1.2.5 Gradient, Divergence, and Rotation in General Coordinates
    1.2.6 Tensor Extension, Christoffel Symbols
    1.2.7 Reformulation of Partial Differential Quotients
  1.3 Measurements and Errors
    1.3.1 Introduction
    1.3.2 Mean Value and Average Error
    1.3.3 Error Distribution
    1.3.4 Error Propagation
    1.3.5 Finite Measurement Series and Their Average Error
    1.3.6 Error Analysis
    1.3.7 Method of Least Squares
  List of Symbols
  Suggestions for Further Reading


2 Classical Mechanics
  2.1 Basic Concepts
    2.1.6 The Kepler Problem
    2.1.7 Summary: Basic Concepts
  2.2 Newtonian Mechanics
    2.2.1 Force-Free Motion
    2.2.2 Center-of-Mass Theorem
    2.2.9 Rigid Bodies
    2.2.11 Principal Axis Transformation
  2.3 Lagrangian Mechanics
    2.3.1 D’Alembert’s Principle
    Velocity-Dependent Forces and Friction
    Conserved Quantities. Canonical and Mechanical Momentum

3 Electromagnetism
    Coulomb’s Law: Far or Near Action?
    Electrostatic Potential
    Energy of the Electrostatic Field
    3.1.9 Maxwell Stress Tensor in Electrostatics
    3.1.10 Summary: Electrostatics
  3.2 Stationary Currents and Magnetostatics
    Magnetic Moments
    Charge Conservation and Maxwell’s Displacement Current
    Faraday Induction Law and Lenz’s Rule
    Maxwell’s Equations
    Time-Dependent Potentials
    Poynting’s Theorem
  3.4 Lorentz Invariance
    3.4.1 Velocity of Light in Vacuum
    3.4.2 Lorentz Transformation
    3.4.3 Four-Vectors
    3.4.9 Relativistic Dynamics of Free Particles
    3.4.10 Relativistic Dynamics with External Forces
    3.4.11 Energy-Momentum Stress Tensor
    3.4.12 Summary: Lorentz Invariance
  3.5 Radiation Fields
    3.5.7 Summary: Radiation Fields
  References
  Suggestions for Textbooks and Further Reading

4 Quantum Mechanics I
    Improper Hilbert Vectors
    4.2.1 Linear and Anti-linear Operators
    4.2.2 Matrix Elements and Representation of Linear Operators
    Doublets and Pauli Operators
    Wigner Function
    Angular Momentum Operator
    Spherical Harmonics
    4.3.11 Summary: Correspondence Principle
  4.4 Time Dependence
  4.5 Time-Independent Schrödinger Equation
    Reduction to Ordinary Differential Equations
    Perturbation Theory
    Variational Method
    Markov Approximation
    Deriving the Rate Equation

5 Quantum Mechanics II
    5.1.6 Representations of the Resolvents and the Interactions
  5.2 Two- and Three-Body Scattering Problems
    5.2.1 Two-Potential Formula of Gell-Mann and Goldberger
    General Properties of Creation and Annihilation Operators
    The Two-Body System as an Example
    Representation of One-Particle Operators
    Representation of Two-Body Operators
    Time Dependence
    Wave-Particle Dualism
    Summary: Many-Body Systems
    5.4.5 Hartree-Fock-Bogoliubov Equations
    5.4.6 Hole States
  5.5 Photons
  5.6 Dirac Equation
    5.6.1 Relativistic Invariance
  Suggestions for Textbooks and Further Reading

6 Thermodynamics and Statistics
    6.2.3 Liouville and Collision-Free Boltzmann Equation
    6.2.4 Boltzmann Equation
    Thermodynamical Potentials
    6.4.8 Enthalpy and Free Energy as State Variables
    6.5.7 Electromagnetic Radiation in a Cavity
    6.5.8 Lattice Vibrations
    6.5.9 Summary: Results for the Single-Particle Model
    6.6.3 Critical Behavior
    6.6.4 Paramagnetism
    6.6.5 Ferromagnetism
    6.6.7 Summary: Phase Transitions
  Problems
  Suggestions for Textbooks and Further Reading

Appendix A: Important Constants

Index
Chapter 1
Basics of Experience


1.1 Vector Analysis 


1.1.1 Space and Time 


Space and time are two basic concepts which, according to Kant, inherently or
innately determine the form of all experience in an a priori manner, thereby making
possible experience as such: only in space and time can we arrange our sensations.
[According to the doctrines of evolutionary cognition, what is innate to us has
developed phylogenetically by adaptation to our environment. This is why we only notice
the insufficiency of these “self-evident” concepts under extraordinary circumstances,
e.g., for velocities close to that of light ($c_0$) or actions of the order of Planck’s
quantum $h$. We shall tackle such “weird” cases later, in electromagnetism and quantum
mechanics. For the time being, we want to make sure we can handle our familiar
environment.]

To do this, we introduce a continuous parameter t. Like every other physical 
quantity it is composed of number and unit (for example, a second 1 s = 1 min/60 
= 1 h/3600). The larger the unit, the smaller the number. Physical quantities do not 
depend on the unit—likewise equations between physical quantities. Nevertheless, 
the opposite is sometimes seen, as in: “We choose units such that the velocity of light 
c assumes the value 1”. In fact, the concept of velocity is thereby changed, so that 
instead of the velocity v, the ratio v/c is taken here as the velocity, and ct as time or 
x/c as length. 

The zero time ($t = 0$) can be chosen arbitrarily, since basically only the time
difference, i.e., the duration of a process, is important. A differentiation with respect to
time ($\mathrm{d}/\mathrm{d}t$) is often marked by a dot over the differentiated quantity, i.e., $\mathrm{d}x/\mathrm{d}t \equiv \dot{x}$.

In empty space every direction is equivalent. Here, too, we may choose the zero 
point freely and, starting from this point, determine the position of other points in 
a coordinate-free notation by the position vector r, which fixes the distance and 
direction of the point under consideration. This coordinate-free type of notation is 


© Springer Nature Switzerland AG 2018 1 
A. Lindner and D. Strauch, A Complete Course 

on Theoretical Physics, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-04360-5_1 




particularly advantageous when we want to exploit the assumed homogeneity of
space. However, conditions often arise (e.g., when there is axial or spherical
symmetry) which are best taken care of in special coordinates. We are free to choose a
coordinate system. We only require that it determine all positions uniquely. This we
shall treat in the next section.

Besides the position vector $\mathbf{r}$, there are other quantities in physics with both
value and direction, e.g., the velocity $\mathbf{v} = \dot{\mathbf{r}}$, the acceleration $\mathbf{a} = \dot{\mathbf{v}}$, the momentum
$\mathbf{p} = m\mathbf{v}$, and the force $\mathbf{F} = \dot{\mathbf{p}}$. The appropriate means to handle such quantities is
vector algebra, with which we shall be extensively concerned in this section. This
method allows us to encompass both the value and the direction of the quantities
under consideration much better than using components, which, moreover, depend
on the coordinate system.

For the time being (namely for plane and three-dimensional problems) we
understand a vector as a quantity with value and direction, which can be
represented as an arrow of corresponding length. (Generally, vectors are mathematical
entities, which can be added together or multiplied by a number, with the usual rules
of calculation being valid.) Sometimes they are denoted by a letter with an arrow
atop. The value (the length) of $\mathbf{a}$ is denoted by $a$ or $|\mathbf{a}|$.


1.1.2 Vector Algebra 


From two vectors $\mathbf{a}$ and $\mathbf{b}$, their sum $\mathbf{a} + \mathbf{b}$ may be formed according to the
construction of parallelograms (as the diagonal), as shown in Fig. 1.1. From this follow
the commutative and associative laws of vector addition:

$$\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}\,, \qquad (\mathbf{a} + \mathbf{b}) + \mathbf{c} = \mathbf{a} + (\mathbf{b} + \mathbf{c})\,.$$

The product of the vector $\mathbf{a}$ with a scalar (i.e., directionless) factor $\alpha$ is understood
as the vector $\alpha\,\mathbf{a} = \mathbf{a}\,\alpha$ with the same (for $\alpha < 0$ opposite) direction and with value
$|\alpha|\,a$. In particular, $\mathbf{a}$ and $-\mathbf{a}$ have the same value, but opposite directions. For $\alpha = 0$
the zero vector $\mathbf{0}$ results, with length 0 and undetermined direction.


Fig. 1.1 Sum and difference of vectors $\mathbf{a}$ and $\mathbf{b}$. The vectors may be shifted in parallel, e.g., $\mathbf{a} - \mathbf{b}$
can also lie on the dashed straight line


Fig. 1.2 Scalar and vector products: $\mathbf{e}\cdot\mathbf{a}$ is the component of $\mathbf{a}$ in the direction of the unit vector
$\mathbf{e}$, and $|\mathbf{a}\times\mathbf{b}|$ is the area shown


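The scalar product (inner product) $\mathbf{a}\cdot\mathbf{b}$ of the two vectors $\mathbf{a}$ and $\mathbf{b}$ is the product
of their values times the cosine of the enclosed angle $\varphi_{ab}$ (see Fig. 1.2 left):

$$\mathbf{a}\cdot\mathbf{b} = a\,b\,\cos\varphi_{ab}\,.$$

The dot between the two factors is important for the scalar product. If it is missing,
then it is the tensor product of the two vectors, which will be explained in Sect. 1.2.4,
with $\mathbf{a}\cdot\mathbf{b}\;\mathbf{c} \neq \mathbf{a}\;\mathbf{b}\cdot\mathbf{c}$ if $\mathbf{a}$ and $\mathbf{c}$ have different directions, i.e., if $\mathbf{a}$ is not a multiple
of $\mathbf{c}$. Consequently, one has

$$\mathbf{a}\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{a}$$

and

$$\mathbf{a}\cdot\mathbf{b} = 0 \iff \mathbf{a}\perp\mathbf{b} \ \text{ or } \ \mathbf{a} = \mathbf{0} \ \text{ or } \ \mathbf{b} = \mathbf{0}\,.$$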


If the two vectors are oriented perpendicularly to each other ($\mathbf{a}\perp\mathbf{b}$), then they are
also said to be orthogonal. Obviously, $\mathbf{a}\cdot\mathbf{a} = a^2$ holds. Vectors with value 1 are
called unit vectors. Here they are denoted by $\mathbf{e}$. Given three Cartesian, i.e., pairwise
perpendicular unit vectors $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$, all vectors can be decomposed in terms of these:

$$\mathbf{a} = \mathbf{e}_x\,a_x + \mathbf{e}_y\,a_y + \mathbf{e}_z\,a_z\,,$$

with the Cartesian components

$$a_x \equiv \mathbf{e}_x\cdot\mathbf{a}\,, \qquad a_y \equiv \mathbf{e}_y\cdot\mathbf{a}\,, \qquad a_z \equiv \mathbf{e}_z\cdot\mathbf{a}\,.$$

Here the components will usually be written after the unit vectors. This is particularly
useful in quantum mechanics, but also meaningful otherwise, since the coefficients
depend on the expansion basis. Since for a given basis $\mathbf{a}$ is fixed by its three
components $(a_x, a_y, a_z)$, $\mathbf{a}$ is thus often given as this row vector, or as a column vector, with
the components written one below the other. However, the coordinate-free notation
$\mathbf{a}$ is in most cases more appropriate to formal calculations, e.g., $\mathbf{a} + \mathbf{b}$ combines the
three expressions $a_x + b_x$, $a_y + b_y$, and $a_z + b_z$. Because $\mathbf{e}_x\cdot\mathbf{e}_x = 1$, $\mathbf{e}_x\cdot\mathbf{e}_y = 0$
(and cyclic permutations $\mathbf{e}_y\cdot\mathbf{e}_y = 1$, $\mathbf{e}_y\cdot\mathbf{e}_z = 0$, and so on), one clearly has

$$\mathbf{a}\cdot\mathbf{b} = a_x b_x + a_y b_y + a_z b_z\,.$$

Hence it also follows that $\mathbf{a}\cdot(\mathbf{b} + \mathbf{c}) = \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\cdot\mathbf{c}$.
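The component formulas above can be tried out numerically. The following is only a minimal sketch in plain Python; the helper names `dot`, `norm`, and `angle`, and the sample vectors, are assumptions made for illustration and are not part of the text.

```python
import math

def dot(a, b):
    """Scalar product a . b from Cartesian components."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    """Value (length) of a, using a . a = a^2."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Enclosed angle phi_ab from a . b = a b cos(phi_ab)."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

# Cartesian unit vectors (illustrative names)
e_x, e_y, e_z = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

a = (3.0, 4.0, 0.0)
b = (0.0, 2.0, 0.0)

# Components a_x, a_y, a_z as projections e_x . a, e_y . a, e_z . a
components = tuple(dot(e, a) for e in (e_x, e_y, e_z))
cos_phi = math.cos(angle(a, b))
```

Projecting onto the unit vectors returns the original components, which is the statement that a vector is fixed by its three components once the basis is given.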


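The vector product (outer product) $\mathbf{a}\times\mathbf{b}$ of the two vectors $\mathbf{a}$ and $\mathbf{b}$ is another
vector which is oriented perpendicularly to both and which forms with them a right-hand
screw, like the thumb, forefinger, and middle finger of the right hand. Its value
is equal to the area of the parallelogram spanned by $\mathbf{a}$ and $\mathbf{b}$ (see Fig. 1.2 right):

$$|\mathbf{a}\times\mathbf{b}| = a\,b\,\sin\varphi_{ab}\,.$$

Hence it also follows that

$$\mathbf{a}\times\mathbf{b} = -\mathbf{b}\times\mathbf{a}\,, \qquad \mathbf{a}\times(\mathbf{b} + \mathbf{c}) = \mathbf{a}\times\mathbf{b} + \mathbf{a}\times\mathbf{c}\,,$$

and

$$\mathbf{a}\times\mathbf{b} = \mathbf{0} \iff \mathbf{a}\parallel\mathbf{b} \ \text{ or } \ \mathbf{a} = \mathbf{0} \ \text{ or } \ \mathbf{b} = \mathbf{0}\,.$$

Using a right-handed Cartesian coordinate system, we have

$$\mathbf{e}_x\times\mathbf{e}_y = \mathbf{e}_z \quad \text{(and cyclic permutations } \mathbf{e}_y\times\mathbf{e}_z = \mathbf{e}_x\,, \ldots)\,,$$

and also $\mathbf{e}_x\times\mathbf{e}_x = \mathbf{0}$, etc., whence

$$\mathbf{a}\times\mathbf{b} = \mathbf{e}_x\,(a_y b_z - a_z b_y) + \mathbf{e}_y\,(a_z b_x - a_x b_z) + \mathbf{e}_z\,(a_x b_y - a_y b_x)\,.$$

This implies

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = (\mathbf{c}\times\mathbf{b})\times\mathbf{a} = \mathbf{b}\;\mathbf{c}\cdot\mathbf{a} - \mathbf{c}\;\mathbf{a}\cdot\mathbf{b}\,.$$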


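(This decomposition also follows without calculation because the product depends
linearly upon its three factors, lies in the plane spanned by $\mathbf{b}$ and $\mathbf{c}$, vanishes for
$\mathbf{b}\propto\mathbf{c}$, and points in the direction of $\mathbf{b}$ for $\mathbf{c} = \mathbf{a}\perp\mathbf{b}$.) According to the last equation,
every vector $\mathbf{a}$ can be decomposed into its component along a unit vector $\mathbf{e}$ and its
component perpendicular to it:

$$\mathbf{a} = \mathbf{e}\;\mathbf{e}\cdot\mathbf{a} - \mathbf{e}\times(\mathbf{e}\times\mathbf{a})\,.$$

In addition, the vector product satisfies the Jacobi identity (note the cyclic permutation)

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) + \mathbf{b}\times(\mathbf{c}\times\mathbf{a}) + \mathbf{c}\times(\mathbf{a}\times\mathbf{b}) = \mathbf{0}\,.$$

The scalar product of a vector with a vector product, viz.,

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \mathbf{b}\cdot(\mathbf{c}\times\mathbf{a}) = \mathbf{c}\cdot(\mathbf{a}\times\mathbf{b})\,,$$

is called the (scalar) triple product of the three vectors. It is positive or negative, if
$\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ form a right- or left-handed triad, respectively. Its value gives the volume
of the parallelepiped with edges $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$. In particular, $\mathbf{e}_x\cdot(\mathbf{e}_y\times\mathbf{e}_z) = 1$.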
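The identities for the double vector product, the Jacobi identity, and the cyclic invariance of the triple product can be checked with integer components, where the arithmetic is exact. This is a small sketch under the assumption of a right-handed Cartesian system; the helper names and the sample vectors are chosen here for illustration only.

```python
def dot(a, b):
    """Scalar product from Cartesian components."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product in a right-handed Cartesian system."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(*vectors):
    """Component-wise sum of any number of vectors."""
    return tuple(sum(c) for c in zip(*vectors))

def scale(s, a):
    """Product of the scalar s with the vector a."""
    return tuple(s * x for x in a)

a, b, c = (1, 2, 3), (-2, 0, 5), (4, -1, 2)

# a x (b x c) = b (c . a) - c (a . b)
lhs = cross(a, cross(b, c))
rhs = add(scale(dot(c, a), b), scale(-dot(a, b), c))

# Jacobi identity: the cyclic sum vanishes
jacobi = add(cross(a, cross(b, c)),
             cross(b, cross(c, a)),
             cross(c, cross(a, b)))

# The scalar triple product is invariant under cyclic permutations
triple = (dot(a, cross(b, c)), dot(b, cross(c, a)), dot(c, cross(a, b)))
```

A positive triple product, as obtained for these sample vectors, indicates that the three vectors form a right-handed triad.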




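In this context, the concept of a matrix is useful. An $M\times N$ matrix $A$ is understood
as an entity made of $M\times N$ “matrix elements” $A_{ik}$ with $i \in \{1, \ldots, M\}$ and $k \in \{1, \ldots, N\}$,
arranged in $M$ rows and $N$ columns, e.g.,

$$A = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} \implies \widetilde{A} = \begin{pmatrix} A_{11} & A_{21} & A_{31} \\ A_{12} & A_{22} & A_{32} \\ A_{13} & A_{23} & A_{33} \end{pmatrix}.$$

The transposed matrix $\widetilde{A}$ just introduced has elements $\widetilde{A}_{ik} = A_{ki}$, hence $N$ rows and
$M$ columns. We shall mainly be concerned with square matrices, which have equal
numbers of rows and columns, i.e., $M = N$. The matrix product of $A$ and $B$ is

$$C = AB \quad \text{with} \quad C_{ik} = \sum_{j=1}^{N} A_{ij}\,B_{jk}\,,$$

which is, of course, defined only if the number of columns of $A$ is the same as the
number of rows of $B$. We have $\widetilde{AB} = \widetilde{B}\,\widetilde{A}$.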
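The rule that the transpose of a product is the product of the transposes in reverse order is easy to verify for a small non-square case. The sketch below uses plain nested lists; the function names `transpose` and `matmul` and the sample matrices are illustrative assumptions.

```python
def transpose(A):
    """Transposed matrix: element (i, k) becomes element (k, i)."""
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    """Matrix product C = A B with C_ik = sum_j A_ij B_jk."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 0],
     [0, 1, 3]]          # 2 x 3 matrix
B = [[1, 0],
     [2, 1],
     [1, 4]]             # 3 x 2 matrix

C = matmul(A, B)                                 # 2 x 2 matrix
C_transposed = transpose(C)
reversed_product = matmul(transpose(B), transpose(A))
```

Multiplying the transposes in the original order would give a 3 x 3 matrix here, so the reversal of the factors is needed even for the shapes to agree.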


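If we now combine the $3\times 3$ Cartesian components of the vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ in
the form of a matrix, its determinant

$$\begin{aligned}
\begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}
&= a_x\,(b_y c_z - b_z c_y) + a_y\,(b_z c_x - b_x c_z) + a_z\,(b_x c_y - b_y c_x) \\
&= a_x\,(b_y c_z - b_z c_y) + b_x\,(c_y a_z - c_z a_y) + c_x\,(a_y b_z - a_z b_y)
\end{aligned}$$

is equal to the triple product $\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})$. For determinants, we have

$$\det\widetilde{A} = \det A \quad \text{and} \quad \det(AB) = \det A \times \det B\,.$$

Therefore, also

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})\;\; \mathbf{f}\cdot(\mathbf{g}\times\mathbf{h}) = \begin{vmatrix} \mathbf{a}\cdot\mathbf{f} & \mathbf{a}\cdot\mathbf{g} & \mathbf{a}\cdot\mathbf{h} \\ \mathbf{b}\cdot\mathbf{f} & \mathbf{b}\cdot\mathbf{g} & \mathbf{b}\cdot\mathbf{h} \\ \mathbf{c}\cdot\mathbf{f} & \mathbf{c}\cdot\mathbf{g} & \mathbf{c}\cdot\mathbf{h} \end{vmatrix}.$$

Moreover, from $(\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} = \mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})$ and replacing $\mathbf{c}$ by $\mathbf{c}\times\mathbf{d}$, it follows that

$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) = (\mathbf{a}\cdot\mathbf{c})(\mathbf{b}\cdot\mathbf{d}) - (\mathbf{a}\cdot\mathbf{d})(\mathbf{b}\cdot\mathbf{c}) = \begin{vmatrix} \mathbf{a}\cdot\mathbf{c} & \mathbf{a}\cdot\mathbf{d} \\ \mathbf{b}\cdot\mathbf{c} & \mathbf{b}\cdot\mathbf{d} \end{vmatrix},$$

the determinant of a $2\times 2$ matrix, and in particular,

$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{a}\times\mathbf{b}) = a^2 b^2 - (\mathbf{a}\cdot\mathbf{b})^2\,,$$

which, of course, follows from $\sin^2\varphi_{ab} = 1 - \cos^2\varphi_{ab}$.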
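As a quick plausibility check of the determinant relations, the sketch below compares a 3 x 3 determinant with the triple product and evaluates both sides of the identity for the scalar product of two vector products. The function `det3` and the sample vectors are assumptions introduced only for this illustration.

```python
def dot(a, b):
    """Scalar product from Cartesian components."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product in a right-handed Cartesian system."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(rows):
    """Determinant of a 3 x 3 matrix, expanded along the first row."""
    (ax, ay, az), (bx, by, bz), (cx, cy, cz) = rows
    return (ax * (by * cz - bz * cy)
            + ay * (bz * cx - bx * cz)
            + az * (bx * cy - by * cx))

a, b, c, d = (1, 2, 3), (-2, 0, 5), (4, -1, 2), (0, 3, -1)

determinant = det3((a, b, c))
triple = dot(a, cross(b, c))

# (a x b) . (c x d) = (a . c)(b . d) - (a . d)(b . c)
lhs = dot(cross(a, b), cross(c, d))
rhs = dot(a, c) * dot(b, d) - dot(a, d) * dot(b, c)

# Special case c = a, d = b: |a x b|^2 = a^2 b^2 - (a . b)^2
area_squared = dot(cross(a, b), cross(a, b))
```

Integer components keep every comparison exact, so no tolerance is needed in this check.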


Table 1.1 Space-inversion behavior

                  Original image    Mirror image
  Polar vector    ↑                 ↓
  Axial vector    ↑                 ↑

It is not allowed to divide by vectors: neither scalar products nor vector products
can be decomposed uniquely in terms of their factors, as can be seen from the
examples $\mathbf{a}\cdot\mathbf{b} = 0$ and $\mathbf{a}\times\mathbf{b} = \mathbf{0}$.

In the context of the vector product, we have to consider the fact that only in 
three-dimensional space can a third vector be assigned uniquely as a vector normal 
to two vectors. Otherwise a perpendicular direction cannot be fixed uniquely, and 
no direction can be given in the sense of the right-hand rule. In fact, in Sect. 3.4.3, 
in order to extend the three-dimensional space to the four-dimensional space-time 
continuum of the theory of special relativity, we change from the vector product to 
a skew-symmetric matrix (or a tensor of second rank) which, in three-dimensional 
space, has three independent elements, just like every vector. 

Actually, we also have to distinguish between polar vectors (like the position
vector $\mathbf{r}$ and the velocity $\mathbf{v} = \dot{\mathbf{r}}$) and axial vectors (e.g., the vector product of two
polar vectors), because they behave differently under a space inversion (with respect
to the origin): the direction of a polar vector is reversed, while the direction of an
axial vector is preserved. Correspondingly the triple product of three polar vectors is
a pseudo-scalar, because it changes its sign under space inversion. Axial vectors can
actually be viewed as rotation axes with sense of rotation and not as arrows; they
are pseudo-vectors (Table 1.1).
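The different behavior of polar and axial vectors can be made concrete by applying the inversion to components. In this sketch the inversion is modeled simply as a sign flip of all Cartesian components of each polar vector; the names `invert`, `r`, `p`, and `q` are illustrative assumptions.

```python
def dot(a, b):
    """Scalar product from Cartesian components."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product in a right-handed Cartesian system."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def invert(v):
    """Space inversion of a polar vector: every component changes sign."""
    return tuple(-x for x in v)

# Three polar vectors (sample values)
r, p, q = (1, 2, 3), (0, -1, 4), (2, 0, 1)

axial = cross(r, p)                           # vector product of two polar vectors
axial_inverted = cross(invert(r), invert(p))  # built from the inverted vectors

pseudo_scalar = dot(q, cross(r, p))
pseudo_scalar_inverted = dot(invert(q), cross(invert(r), invert(p)))
```

The two sign flips cancel in the vector product, while the three sign flips in the triple product leave one overall sign change.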

Inversion involves a special change of coordinates: it cannot be composed of
infinitesimal transformations, like rotations and translations. General properties of
coordinate transformations will be treated in the next section. Until then we will
thus assume only right-handed Cartesian coordinate systems with $\mathbf{e}_x\times\mathbf{e}_y = \mathbf{e}_z$ (and
cyclic permutations).


1.1.3 Trajectories 


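If a vector depends upon a parameter, then we speak of a vector function. The vector
function $\mathbf{a}(t)$ is continuous at $t_0$, if it tends to $\mathbf{a}(t_0)$ for $t \to t_0$. With the same limit
$t \to t_0$, the vector differential $\mathrm{d}\mathbf{a}$ and the first derivative $\mathrm{d}\mathbf{a}/\mathrm{d}t$ are introduced. These
quantities may be formed for every Cartesian component, and we have

$$\begin{aligned}
\mathrm{d}(\mathbf{a} + \mathbf{b}) &= \mathrm{d}\mathbf{a} + \mathrm{d}\mathbf{b}\,, & \mathrm{d}(\alpha\,\mathbf{a}) &= \alpha\,\mathrm{d}\mathbf{a} + \mathbf{a}\,\mathrm{d}\alpha\,, \\
\mathrm{d}(\mathbf{a}\cdot\mathbf{b}) &= \mathbf{a}\cdot\mathrm{d}\mathbf{b} + \mathbf{b}\cdot\mathrm{d}\mathbf{a}\,, & \mathrm{d}(\mathbf{a}\times\mathbf{b}) &= \mathbf{a}\times\mathrm{d}\mathbf{b} - \mathbf{b}\times\mathrm{d}\mathbf{a}\,.
\end{aligned}$$

Obviously, $\mathbf{a}\cdot\mathrm{d}\mathbf{a}/\mathrm{d}t = \tfrac{1}{2}\,\mathrm{d}(\mathbf{a}\cdot\mathbf{a})/\mathrm{d}t = \tfrac{1}{2}\,\mathrm{d}a^2/\mathrm{d}t = a\,\mathrm{d}a/\mathrm{d}t$ holds. In particular the
derivative of a unit vector is always perpendicular to the original vector, if it does
not vanish.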
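That the derivative of a unit vector has no component along the vector itself can be seen numerically with a finite-difference derivative. The sketch below is only an illustration: the particular vector function, the step size `h`, and the sample time `t0` are arbitrary assumptions.

```python
import math

def unit_vector(t):
    """Illustrative time-dependent unit vector e(t) = a(t) / |a(t)|."""
    a = (1.0 + t * t, t, math.sin(t))
    length = math.sqrt(sum(c * c for c in a))
    return tuple(c / length for c in a)

def derivative(f, t, h=1e-5):
    """Central-difference estimate of df/dt for a vector function f."""
    plus, minus = f(t + h), f(t - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

t0 = 0.7
e = unit_vector(t0)
de_dt = derivative(unit_vector, t0)

# e . de/dt should vanish up to the discretization error
overlap = sum(x * y for x, y in zip(e, de_dt))
```

The derivative itself is not zero for this example, so the vanishing overlap really expresses perpendicularity.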




As an example of a vector function, we investigate $\mathbf{r}(t)$, the path of a point as
a function of the time $t$. Thus we want to consider also the velocity $\mathbf{v} = \dot{\mathbf{r}}$ and the
acceleration $\mathbf{a} = \ddot{\mathbf{r}}$ rather generally. The time is not important for the trajectories as
geometrical lines. Therefore, instead of the time $t$ we introduce the path length $s$ as
a parameter and exploit $\mathrm{d}s = |\mathrm{d}\mathbf{r}| = v\,\mathrm{d}t$.

We now take three mutually perpendicular unit vectors er, en, and eg, which are 
attached to every point on the trajectory. Here er has the direction of v: 


tangent vector er = — = 


For a straight path, this vector is already sufficient for the description. But in general 
the 


path curvature k = 


is different from zero. In order to get more insight into this parameter we consider a 
plane curve of constant curvature, namely, the circle with s = R gy. For r (g) = ro + 
R (cos o e, + sin ọ ey), we have k = |d°r/d(Rọ)?| = R7!. Instead of the curvature 
kK, its reciprocal, the 


curvature radius R= 


can also be used to determine the curve. Hence as a further unit vector we have the

normal vector $\mathbf{e}_N \equiv R\,\dfrac{d\mathbf{e}_T}{ds} = R\,\dfrac{d^2\mathbf{r}}{ds^2}$.


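Since it has the direction of the derivative of the unit vector $\mathbf{e}_T$, it is perpendicular to $\mathbf{e}_T$.
Now we may express the velocity and the acceleration, because $\dot{\mathbf{e}}_T = (d\mathbf{e}_T/ds)\,v = (v/R)\,\mathbf{e}_N$, as follows:

$$\mathbf{v} = \dot{\mathbf{r}} = v\,\mathbf{e}_T, \qquad \mathbf{a} = \ddot{\mathbf{r}} = \dot v\,\mathbf{e}_T + v\,\dot{\mathbf{e}}_T = \dot v\,\mathbf{e}_T + \frac{v^2}{R}\,\mathbf{e}_N .$$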


Thus there is a tangential acceleration $\mathbf{a}\cdot\mathbf{e}_T \equiv a_T = \dot v$ if the magnitude of the velocity
changes, and a normal acceleration $\mathbf{a}\cdot\mathbf{e}_N \equiv a_N = v^2/R$ if the direction of the
velocity changes. From this decomposition we can also see why motions are often
investigated either along a straight line or along a uniformly traveled circle: then
only $a_T$ or only $a_N$ appears.
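
The decomposition into tangential and normal acceleration can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: the path is a circle of radius `R = 2` traversed as r(t) = R (cos t², sin t²), so that v = 2Rt, and all function names are illustrative.

```python
import math

R = 2.0  # assumed radius of the circular path

def velocity_and_acceleration(t):
    """r(t) = R (cos t^2, sin t^2): a circle traversed with growing speed."""
    th, om, al = t * t, 2.0 * t, 2.0   # angle and its first and second time derivatives
    v = (-R * om * math.sin(th), R * om * math.cos(th))
    a = (-R * al * math.sin(th) - R * om**2 * math.cos(th),
         R * al * math.cos(th) - R * om**2 * math.sin(th))
    return v, a

def tangential_normal(t):
    """Split a into a_T = a . e_T and a_N = |a - a_T e_T|."""
    v, a = velocity_and_acceleration(t)
    speed = math.hypot(v[0], v[1])
    e_T = (v[0] / speed, v[1] / speed)
    a_T = a[0] * e_T[0] + a[1] * e_T[1]
    a_N = math.hypot(a[0] - a_T * e_T[0], a[1] - a_T * e_T[1])
    return speed, a_T, a_N

speed, a_T, a_N = tangential_normal(1.5)
print(speed, a_T, a_N)
```

For this path the tangential part should equal dv/dt = 2R and the normal part v²/R, independently of the chosen time.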

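If the curve leaves the plane spanned by $\mathbf{e}_T$ and $\mathbf{e}_N$, then the

binormal vector $\mathbf{e}_B \equiv \mathbf{e}_T \times \mathbf{e}_N$

also changes with $s$. Because $d\mathbf{e}_T/ds = \kappa\,\mathbf{e}_N$, its derivative with respect to $s$ is equal
to $\mathbf{e}_T \times d\mathbf{e}_N/ds$. This expression (perpendicular to $\mathbf{e}_T$) must be proportional to $\mathbf{e}_N$,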




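because derivatives of unit vectors do not have components in their direction. Since
$\mathbf{e}_N = \mathbf{e}_B \times \mathbf{e}_T$, besides

$$\frac{d\mathbf{e}_T}{ds} = \kappa\,\mathbf{e}_N, \quad\text{the derivatives}\quad \frac{d\mathbf{e}_B}{ds} = -\tau\,\mathbf{e}_N \quad\text{and}\quad \frac{d\mathbf{e}_N}{ds} = \tau\,\mathbf{e}_B - \kappa\,\mathbf{e}_T$$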


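appear with the torsion $\tau$, also called the winding or second curvature. For a right-hand
thread, one has $\tau > 0$, and for a left-hand thread, $\tau < 0$. The relation

$$\tau = R^2 \left(\frac{d\mathbf{r}}{ds}\times\frac{d^2\mathbf{r}}{ds^2}\right)\cdot\frac{d^3\mathbf{r}}{ds^3}$$

also holds, because of $\tau = \mathbf{e}_B\cdot(d\mathbf{e}_N/ds)$ and $\mathbf{e}_B = \mathbf{e}_T\times\mathbf{e}_N$. (Here it is unimportant
for the winding whether the curvature depends upon $s$.)
With the Darboux vector

$$\boldsymbol{\delta} = \kappa\,\mathbf{e}_B + \tau\,\mathbf{e}_T ,$$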


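the expressions just obtained for the derivatives of the three unit vectors with respect
to the curve length $s$ (Frenet–Serret formulas) can be combined to yield

$$\frac{d\mathbf{e}_\bullet}{ds} = \boldsymbol{\delta}\times\mathbf{e}_\bullet \quad\text{with } \mathbf{e}_\bullet \in \{\mathbf{e}_T, \mathbf{e}_N, \mathbf{e}_B\}.$$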


As long as neither the first nor the second curvature changes along the curve, the Darboux
vector is constant: $d\kappa/ds = 0 = d\tau/ds \Longrightarrow d\boldsymbol{\delta}/ds = 0$, because $\kappa\,d\mathbf{e}_B/ds = -\tau\,d\mathbf{e}_T/ds$.
The curve winds around it. An example will follow in Sect. 2.2.5,
namely the spiral curve of a charged particle in a homogeneous magnetic field:
in this case the Darboux vector is $\boldsymbol{\delta} = -q\mathbf{B}/(mv)$. The curves with constant $\boldsymbol{\delta}$ thus
depend upon the initial velocity $\mathbf{v}_0$. Among these are also circular orbits (perpendicular
to $\boldsymbol{\delta}$) and straight lines (along $\pm\boldsymbol{\delta}$), where admittedly a straight line has
vanishing curvature ($\kappa = 0$), and the concept of the second curvature (winding)
thus has no meaning. The quantities $\boldsymbol{\delta}$ and $\mathbf{v}_0$ yield the winding $\tau = \boldsymbol{\delta}\cdot\mathbf{v}_0/v_0$ and the
curvature $\kappa$ ($\ge 0$) because of $\delta^2 = \kappa^2 + \tau^2$. The radius $h$ and the helix angle $\alpha$
(with $|\alpha| \le \tfrac12\pi$) of the associated thread follow from $h = \kappa/\delta^2$ and $\alpha = \arctan(\tau/\kappa)$.
[With $\mathbf{r} = \mathbf{r}_0 + h\,(\cos\varphi\,\mathbf{e}_x + \sin\varphi\,\mathbf{e}_y + \tan\alpha\;\varphi\,\mathbf{e}_z)$ and $s\cos\alpha = h\varphi$ and because
of $\tan\alpha = \tau/\kappa$, the scalar triple product expression for $\tau$ yields the equation
$\cos^2\alpha = h/R$.] The geometric mean of the curvature radius $R$ and the radius $h$
is thus the reciprocal of the length of the Darboux vector (see Fig. 1.3).
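
As a hedged illustration, the helix relations quoted above (δ² = κ² + τ², h = κ/δ², α = arctan τ/κ, cos²α = h/R) can be verified numerically. The sketch assumes the arc-length parametrization r(s) = h (cos φ, sin φ, φ tan α) with φ = s cos α / h; the helper names and the sample values of `h` and `alpha` are illustrative, not taken from the book.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def helix_derivatives(s, h, alpha):
    """First three s-derivatives of r = h (cos p, sin p, p tan(alpha)), p = s cos(alpha) / h."""
    c = math.cos(alpha) / h                  # dp/ds
    p = c * s
    A, B = math.cos(alpha), math.sin(alpha)
    r1 = (-A * math.sin(p), A * math.cos(p), B)
    r2 = (-A * c * math.cos(p), -A * c * math.sin(p), 0.0)
    r3 = (A * c * c * math.sin(p), -A * c * c * math.cos(p), 0.0)
    return r1, r2, r3

def curvature_torsion(r1, r2, r3):
    kappa = math.sqrt(dot(r2, r2))            # kappa = |d^2 r / ds^2|
    tau = dot(cross(r1, r2), r3) / kappa**2   # tau = R^2 (r' x r'') . r'''
    return kappa, tau

h, alpha = 1.5, 0.4                           # assumed helix radius and helix angle
kappa, tau = curvature_torsion(*helix_derivatives(0.7, h, alpha))
delta = math.hypot(kappa, tau)                # length of the Darboux vector
print(kappa, tau, delta)
```

The values of `kappa / delta**2` and `math.atan(tau / kappa)` should reproduce the input radius and helix angle.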

If the curve traveled is given by the functions $y(x)$ and $z(x)$ in Cartesian coordinates,
then we have

$$\frac{d^2\mathbf{r}}{ds^2} = \frac{d}{ds}\left(\frac{d\mathbf{r}}{ds}\right) = \frac{d}{dx}\left(\frac{d\mathbf{r}}{dx}\,\frac{dx}{ds}\right)\frac{dx}{ds},$$

and because $ds^2 = dx^2 + dy^2 + dz^2$, we also have $dx/ds = 1/\sqrt{1 + y'^2 + z'^2}$ with
$y' \equiv dy/dx$ and $z' \equiv dz/dx$. Hence, the square of the path curvature is given by




Fig. 1.3 Spiral curve around the constant Darboux vector $\boldsymbol{\delta}$ oriented to the right (constant curvature
and winding, here with $\kappa = \tau$). Shown are also the tangent and binormal vectors of the moving
frame and the tangential circle. Not shown is the normal vector $\mathbf{e}_N = \mathbf{e}_B \times \mathbf{e}_T$, which points toward
the symmetry axis


$$\kappa^2 = \frac{(y'z'' - y''z')^2 + y''^2 + z''^2}{(1 + y'^2 + z'^2)^3}$$

and the torsion by

$$\tau = \frac{y''z''' - y'''z''}{\kappa^2\,(1 + y'^2 + z'^2)^3}.$$


For the curvature, we have $\kappa \ge 0$, while $\tau$ is negative for a left-hand thread.
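
The Cartesian expressions can be sketched in code. The block below assumes the forms κ² = [(y'z'' − y''z')² + y''² + z''²]/(1 + y'² + z'²)³ and τ = (y''z''' − y'''z'')/[κ²(1 + y'² + z'²)³], and tests them on an assumed helix around the x axis, y = h cos(x/c), z = h sin(x/c), for which κ = h/(h² + c²) and τ = c/(h² + c²). The function name and parameter values are illustrative.

```python
import math

def curvature_torsion(y1, y2, y3, z1, z2, z3):
    """kappa and tau from y', y'', y''', z', z'', z''' (derivatives with respect to x)."""
    g = 1.0 + y1 * y1 + z1 * z1
    kappa2 = ((y1 * z2 - y2 * z1) ** 2 + y2 * y2 + z2 * z2) / g**3
    tau = (y2 * z3 - y3 * z2) / (kappa2 * g**3)
    return math.sqrt(kappa2), tau

# Assumed test curve: helix around the x axis with radius h and x = c * phi
h, c, x = 1.5, 0.5, 0.3
u = x / c
kappa, tau = curvature_torsion(
    -h / c * math.sin(u), -h / c**2 * math.cos(u),  h / c**3 * math.sin(u),
     h / c * math.cos(u), -h / c**2 * math.sin(u), -h / c**3 * math.cos(u))
print(kappa, tau)
```

Both results should be independent of the position `x` along the helix, and the positive sign of `tau` corresponds to a right-hand thread.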


1.1.4 Vector Fields 


If a vector is associated with each position, we speak of a vector field. With scalar
fields, a scalar is associated with each position. The vector field $\mathbf{a}(\mathbf{r})$ is only continuous
at $\mathbf{r}_0$ if all paths approaching $\mathbf{r}_0$ have the same limit. For scalar fields, this is
already an essentially stronger requirement than in one dimension.

Instead of drawing a vector field with arrows at many positions, it is often visualized
by a set of field lines: at every point of a field line the tangent points in the
direction of the vector field. Thus $\mathbf{a} \parallel d\mathbf{r}$ and $\mathbf{a}\times d\mathbf{r} = 0$.

For a given vector field many integrals can be formed. In particular, we often
have to evaluate integrals over surfaces or volumes. In order to avoid double or triple
integral symbols, the corresponding differential is often written immediately after
the integral symbol: $dV$ for the volume, $d\mathbf{f}$ for the surface integral, e.g., $\int d\mathbf{f}\times\mathbf{a}$
instead of $-\int \mathbf{a}\times d\mathbf{f}$ (in this way the unnecessary minus sign is avoided for the
introduction of the curl density or rotation on p. 13). Here $d\mathbf{f}$ is perpendicular to the
related surface element. However, the sign of $d\mathbf{f}$ still has to be fixed. In general, we
consider the surface of a volume $V$, which will be denoted here by $(V)$. Then $d\mathbf{f}$
points outwards. Corresponding to $(V)$, the edge of an area $A$ is denoted by $(A)$.

An important example of a scalar integral is the line integral $\int d\mathbf{r}\cdot\mathbf{a}(\mathbf{r})$ along
a given curve $\mathbf{r}(t)$. If the parameter $t$ determines the points on the curve uniquely,
then the line integral




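$$\int_{\mathbf{r}_0}^{\mathbf{r}_1} d\mathbf{r}\cdot\mathbf{a}(\mathbf{r}) = \int_{t_0}^{t_1} dt\;\frac{d\mathbf{r}}{dt}\cdot\mathbf{a}(\mathbf{r}(t))$$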


is an ordinary integral over the scalar product $\mathbf{a}\cdot d\mathbf{r}/dt$. Another example of a scalar
integral is the surface integral $\int d\mathbf{f}\cdot\mathbf{a}(\mathbf{r})$ taken over a given area $A$ or over the surface
$(V)$ of the volume $V$.

Besides the scalar integrals, vectorial integrals like $\int dV\,\mathbf{a}$, $\int d\mathbf{f}\times\mathbf{a}$, and $\int d\mathbf{r}\times\mathbf{a}$
can arise, e.g., the $x$-component of $\int dV\,\mathbf{a}$ is the simple integral $\int dV\,a_x$.

Further fields can be obtained through differentiation: vector fields can be
deduced from scalar fields, and scalar fields (but also vector fields and tensor fields)
from vector fields. These will now be considered one by one. The operator $\nabla$
will always turn up. The symbol $\nabla$, an upside-down $\Delta$, resembles an Ancient Greek
harp and hence is called nabla, after W. R. Hamilton (see p. 122).


1.1.5 Gradient (Slope Density) 


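The gradient of a scalar function $\psi(\mathbf{r})$ is the vector field

$$\operatorname{grad}\psi \equiv \nabla\psi, \quad\text{with}\quad \nabla\psi\cdot d\mathbf{r} \equiv d\psi \equiv \psi(\mathbf{r} + d\mathbf{r}) - \psi(\mathbf{r}).$$

This is clearly perpendicular to the area $\psi = \text{const.}$ at every point and points in
the direction of $d\psi > 0$ (see Fig. 1.4). The magnitude of the vector $\nabla\psi$ is equal to the
derivative of the scalar function $\psi(\mathbf{r})$ with respect to the line element in this direction.
In Cartesian coordinates, we thus have

$$\nabla\psi = \mathbf{e}_x\frac{\partial\psi}{\partial x} + \mathbf{e}_y\frac{\partial\psi}{\partial y} + \mathbf{e}_z\frac{\partial\psi}{\partial z} = \left(\mathbf{e}_x\frac{\partial}{\partial x} + \mathbf{e}_y\frac{\partial}{\partial y} + \mathbf{e}_z\frac{\partial}{\partial z}\right)\psi.$$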


Fig. 1.4 Gradient $\nabla\psi$ of a scalar field $\psi(\mathbf{r})$ represented by arrows. Contour lines with constant $\psi$
are drawn as continuous red and field lines (slope lines) of the gradient field as dashed blue. In the
example considered here, both families of curves contain only hyperbolas (and their asymptotes)




Here $\partial\psi/\partial x$ is the partial derivative of $\psi(x, y, z)$ with respect to $x$ for constant
$y$ and $z$. (If other quantities are kept fixed instead, then special rules have to be
considered, something we shall deal with in Sect. 1.2.7.)

The gradient is also obtained as a limit of a vectorial integral:

$$\nabla\psi = \lim_{V\to 0}\frac{1}{V}\int_{(V)} d\mathbf{f}\,\psi .$$


If we take a cube with infinitesimal edges $dx$, $dy$, and $dz$, we have on the right-hand
side as $x$-component $(dx\,dy\,dz)^{-1}\{dy\,dz\,\psi(x + dx, y, z) - dy\,dz\,\psi(x, y, z)\} = \partial\psi/\partial x$,
and similarly for the remaining components. Hence, also

$$\int_V dV\,\nabla\psi = \int_{(V)} d\mathbf{f}\,\psi ,$$


because a finite volume can be divided into infinitesimal volume elements, and for
continuous $\psi$, contributions from adjacent planes cancel in pairs. With this surface
integral the gradient can be determined even if $\psi$ is not differentiable (singular) at
individual points: the surface integral depends only upon points in the neighbourhood
of the singular point, where everything is continuous. (In Sect. 1.1.12, we shall
consider the example $\psi = 1/r$.)

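Corresponding to $d\psi = (d\mathbf{r}\cdot\nabla)\,\psi$, we shall also write in the following

$$d\mathbf{a} = (d\mathbf{r}\cdot\nabla)\,\mathbf{a} = dx\,\frac{\partial\mathbf{a}}{\partial x} + dy\,\frac{\partial\mathbf{a}}{\partial y} + dz\,\frac{\partial\mathbf{a}}{\partial z}.$$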


We also attribute a meaning to the operation $\nabla\mathbf{a}$, but notice that there is no scalar
product between $\nabla$ and $\mathbf{a}$ (rather it is the dyadic or tensor product, as shown in the
next section), but there is a scalar product between $d\mathbf{r}$ and $\nabla$. Then for a Taylor
series, we may write

$$\psi(\mathbf{r} + d\mathbf{r}) = \psi(\mathbf{r}) + (d\mathbf{r}\cdot\nabla)\,\psi + \tfrac12\,(d\mathbf{r}\cdot\nabla)^2\,\psi + \cdots,$$


where all derivatives are to be taken at the position r. 


1.1.6 Divergence (Source Density) 


While a vector field has been derived from a scalar field with the help of the gradient,
the divergence associates a scalar field with a vector field:

$$\operatorname{div}\mathbf{a} \equiv \nabla\cdot\mathbf{a} \equiv \lim_{V\to 0}\frac{1}{V}\int_{(V)} d\mathbf{f}\cdot\mathbf{a}.$$




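For the same cube as in the last section, the right-hand expression yields

$$\frac{1}{dx\,dy\,dz}\Big[\,dy\,dz\,\{a_x(x{+}dx, y, z) - a_x(x, y, z)\} + dz\,dx\,\{a_y(x, y{+}dy, z) - a_y(x, y, z)\} + dx\,dy\,\{a_z(x, y, z{+}dz) - a_z(x, y, z)\}\Big] = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z},$$

as suggested by the notation $\nabla\cdot\mathbf{a}$, i.e., a scalar product between the vector operator
$\nabla$ and the vector $\mathbf{a}$. With this we have also proven Gauss's theorem

$$\int_V dV\,\nabla\cdot\mathbf{a} = \int_{(V)} d\mathbf{f}\cdot\mathbf{a},$$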


since for any partition of the finite volume V into infinitesimal ones and for a contin- 
uous vector field a, the contributions of adjacent planes cancel in pairs. The integrals 
here may even enclose points at which a (r) is singular (see Fig. 1.5 left). We shall 
discuss this in more detail in Sect. 1.1.12. 
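
Gauss's theorem can be illustrated with a small numerical experiment. The sketch below assumes the sample field a = (x², xy, z) on the unit cube, whose divergence is 3x + 1, and compares the volume integral of the divergence with the outward flux through the six faces using a midpoint rule. The field and all function names are illustrative choices.

```python
def a(x, y, z):
    """Assumed sample vector field."""
    return (x * x, x * y, z)

def div_a(x, y, z):
    # d(x^2)/dx + d(xy)/dy + dz/dz
    return 3.0 * x + 1.0

def midpoints(n):
    return [(i + 0.5) / n for i in range(n)]

def volume_integral(n=20):
    pts = midpoints(n)
    return sum(div_a(x, y, z) for x in pts for y in pts for z in pts) / n**3

def surface_flux(n=20):
    pts = midpoints(n)
    dA = 1.0 / n**2
    flux = 0.0
    for u in pts:
        for w in pts:
            flux += (a(1.0, u, w)[0] - a(0.0, u, w)[0]) * dA  # faces x = 1 and x = 0
            flux += (a(u, 1.0, w)[1] - a(u, 0.0, w)[1]) * dA  # faces y = 1 and y = 0
            flux += (a(u, w, 1.0)[2] - a(u, w, 0.0)[2]) * dA  # faces z = 1 and z = 0
    return flux

print(volume_integral(), surface_flux())
```

The minus signs on the faces at 0 account for the outward-pointing surface element there.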

The integral $\int d\mathbf{f}\cdot\mathbf{a}$ over an area is called the flux of the vector field $\mathbf{a}(\mathbf{r})$ through
this area (even if $\mathbf{a}$ is not a current density). In this picture, the integral over the closed
area $(V)$ describes the source strength of the vector field, i.e., how much more flows
out of $V$ than in (recall that $d\mathbf{f}$ points outwards). The divergence is therefore to be understood as a source density.
A vector field is said to be source-free if its divergence vanishes everywhere. (If the
source density is negative, then "drains" predominate.)

The concept of a field-line tube is also useful (we discussed field lines in
Sect. 1.1.4). Its walls are everywhere parallel to $\mathbf{a}(\mathbf{r})$. Therefore, there is no flux
through the walls, and the flux through the end faces is equal to the volume integral
of $\nabla\cdot\mathbf{a}$. For a source-free vector field ($\nabla\cdot\mathbf{a} = 0$), the flux flowing into the field-line
tube through one end face emerges again from the other.


Fig. 1.5 Fields between coaxial walls. On the left and in the center, the walls are drawn as continuous
lines and the field lines as dashed lines. On the left, the field is curl-free and has sources
on the walls, while in the center it is source-free and has curls on the wall, if in both cases the field
strength $|\mathbf{a}| = a$ decays with increasing distance $R$ from the axis as shown in the right-hand graph,
i.e., in such a way that $aR$ is constant




1.1.7 Curl (Vortex Density) 


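The curl (rotation) of the vector field $\mathbf{a}(\mathbf{r})$ is the vector field

$$\operatorname{rot}\mathbf{a} \equiv \nabla\times\mathbf{a} \equiv \lim_{V\to 0}\frac{1}{V}\int_{(V)} d\mathbf{f}\times\mathbf{a}.$$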


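For the above-mentioned cube with the edges $dx$, $dy$, $dz$, the $x$-component of the
right-hand expression is equal to

$$\frac{1}{dx\,dy\,dz}\Big[\,dz\,dx\,\{a_z(x, y{+}dy, z) - a_z(x, y, z)\} - dx\,dy\,\{a_y(x, y, z{+}dz) - a_y(x, y, z)\}\Big] = \frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z}.$$

With $\partial_i \equiv \partial/\partial x_i$, we thus have

$$\nabla\times\mathbf{a} = \mathbf{e}_x\left(\frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z}\right) + \mathbf{e}_y\left(\frac{\partial a_x}{\partial z} - \frac{\partial a_z}{\partial x}\right) + \mathbf{e}_z\left(\frac{\partial a_y}{\partial x} - \frac{\partial a_x}{\partial y}\right) = \begin{vmatrix} \mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\ \partial_x & \partial_y & \partial_z \\ a_x & a_y & a_z \end{vmatrix},$$

which is the vector product of the operator $\nabla$ and the vector $\mathbf{a}$. This explains the notation
$\nabla\times\mathbf{a}$. Moreover, we have

$$\int_V dV\,\nabla\times\mathbf{a} = \int_{(V)} d\mathbf{f}\times\mathbf{a}$$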


for all continuous vector fields, although they may become singular point-wise, and
even along lines, as will become apparent shortly.
An important result is Stokes's theorem

$$\int_A d\mathbf{f}\cdot(\nabla\times\mathbf{a}) = \oint_{(A)} d\mathbf{r}\cdot\mathbf{a},$$

where $d\mathbf{r}$ is traversed along the edge $(A)$ in the rotational sense that forms a right-hand screw with $d\mathbf{f}$.
The right-hand side is the circulation (rotation) of $\mathbf{a}$, that is, the line integral of $\mathbf{a}$ along
the edge of $A$. In order to get an insight into the theorem, consider an infinitesimal
rectangle in the $yz$-plane. On the left, we have

$$\int d\mathbf{f}\cdot(\nabla\times\mathbf{a}) = \int dy\,dz\left(\frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z}\right),$$


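and on the right

$$\oint_{(A)} d\mathbf{r}\cdot\mathbf{a} = \int dy\,a_y(x, y, z) - \int dy\,a_y(x, y, z + dz) + \int dz\,a_z(x, y + dy, z) - \int dz\,a_z(x, y, z).$$

The first two integrals on the right-hand side together result in $-\int dy\,(\partial a_y/\partial z)\,dz$,
the last two in $\int dz\,(\partial a_z/\partial y)\,dy$. This implies

$$\oint_{(A)} d\mathbf{r}\cdot\mathbf{a} = \int dy\,dz\left(\frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z}\right).$$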


The theorem holds thus for an infinitesimal area. A finite area can be divided into 
sufficiently small ones, where adjacent lines do not contribute, since the integration 
paths from adjacent areas are opposite to each other. 
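
The same bookkeeping of the four edges can be tried on a finite rectangle. The following sketch assumes a sample field with a_y = −yz and a_z = y² on the unit square in the yz-plane, for which the x-component of the curl is 3y; it compares the circulation along the edge (traversed as a right-hand screw around the x axis) with the flux of the curl. Field and function names are illustrative.

```python
def a_y(y, z):
    return -y * z

def a_z(y, z):
    return y * y

def curl_x(y, z):
    # (curl a)_x = d a_z / d y - d a_y / d z = 2 y + y
    return 3.0 * y

def circulation(n=200):
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += a_y(u, 0.0) * h   # edge z = 0, traversed in +y direction
        total += a_z(1.0, u) * h   # edge y = 1, traversed in +z direction
        total -= a_y(u, 1.0) * h   # edge z = 1, traversed in -y direction
        total -= a_z(0.0, u) * h   # edge y = 0, traversed in -z direction
    return total

def curl_flux(n=200):
    h = 1.0 / n
    return sum(curl_x((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

print(circulation(), curl_flux())
```

The signs of the four edge contributions mirror the four integrals in the infinitesimal argument above.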

According to Stokes's theorem we may also set

$$\mathbf{e}_A\cdot(\nabla\times\mathbf{a}) = \lim_{A\to 0}\frac{1}{A}\oint_{(A)} d\mathbf{r}\cdot\mathbf{a},$$

where the unit vector $\mathbf{e}_A$ is perpendicular to the area $A$ and $d\mathbf{r}$ forms a right-hand
screw with $\mathbf{e}_A$. The curl density $\nabla\times\mathbf{a}$ can be introduced more pictorially with this
equation than with the one mentioned first, and even for vector fields which are
singular along a line (perpendicular to the area). Therefore, the inner "conductor" in
Fig. 1.5 may even be an arbitrarily thin "wire".

For $\nabla\times\mathbf{a} \ne 0$, the vector field has a non-vanishing rotation, or vortex. If $\nabla\times\mathbf{a}$
vanishes everywhere, then the field is said to be curl-free (vortex-free).


1.1.8 Rewriting Products. Laplace Operator 


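Given various fields, the linear differential operators gradient, divergence, and rotation
assign other fields to them. They have the following properties:

$$\nabla(\varphi\psi) = \varphi\,\nabla\psi + \psi\,\nabla\varphi,$$
$$\nabla\cdot(\psi\mathbf{a}) = \psi\,\nabla\cdot\mathbf{a} + \mathbf{a}\cdot\nabla\psi,$$
$$\nabla\times(\psi\mathbf{a}) = \psi\,\nabla\times\mathbf{a} - \mathbf{a}\times\nabla\psi,$$
$$\nabla\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\nabla\times\mathbf{a}) - \mathbf{a}\cdot(\nabla\times\mathbf{b}),$$
$$\nabla\times(\mathbf{a}\times\mathbf{b}) = (\mathbf{b}\cdot\nabla)\,\mathbf{a} - \mathbf{b}\,(\nabla\cdot\mathbf{a}) - (\mathbf{a}\cdot\nabla)\,\mathbf{b} + \mathbf{a}\,(\nabla\cdot\mathbf{b}),$$
$$\nabla(\mathbf{a}\cdot\mathbf{b}) = (\mathbf{b}\cdot\nabla)\,\mathbf{a} + \mathbf{b}\times(\nabla\times\mathbf{a}) + (\mathbf{a}\cdot\nabla)\,\mathbf{b} + \mathbf{a}\times(\nabla\times\mathbf{b}).$$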


All these equations can be proven by decomposing into Cartesian coordinates and 
using the product rule for derivatives. For the last three, however, it is better to refer 




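to Sect. 1.1.2 (and the product rule) and place $\nabla$ between the other two vectors, so
that this operator then acts only on the last factor (see Problem 3.1). Since

$$\nabla\cdot\mathbf{r} = 3, \qquad \nabla\times\mathbf{r} = 0, \qquad (\mathbf{a}\cdot\nabla)\,\mathbf{r} = \mathbf{a}$$

(Problem 3.2), we find in particular

$$\nabla\cdot(\psi\mathbf{r}) = 3\psi + \mathbf{r}\cdot\nabla\psi,$$
$$\nabla\times(\psi\mathbf{r}) = -\mathbf{r}\times\nabla\psi,$$
$$(\mathbf{a}\cdot\nabla)\,\psi\mathbf{r} = \mathbf{a}\,\psi + \mathbf{r}\,(\mathbf{a}\cdot\nabla\psi),$$

and

$$\nabla\cdot(\mathbf{a}\times\mathbf{r}) = \mathbf{r}\cdot(\nabla\times\mathbf{a}),$$
$$\nabla\times(\mathbf{a}\times\mathbf{r}) = 2\mathbf{a} + (\mathbf{r}\cdot\nabla)\,\mathbf{a} - \mathbf{r}\,(\nabla\cdot\mathbf{a}),$$
$$\nabla(\mathbf{a}\cdot\mathbf{r}) = \mathbf{a} + (\mathbf{r}\cdot\nabla)\,\mathbf{a} + \mathbf{r}\times(\nabla\times\mathbf{a}).$$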


These equations are generally applicable and save us lengthy calculations; we shall
use them often. Besides these, we also have

$$\nabla r^n = n\,r^{n-2}\,\mathbf{r},$$

not only for integer numbers $n$, but also for fractions. Furthermore, if $\psi$ and $\mathbf{a}$
have continuous derivatives with respect to their coordinates, then the order of the
derivatives may be interchanged, viz.,

$$\nabla\times\nabla\psi = 0 \quad\text{and}\quad \nabla\cdot(\nabla\times\mathbf{a}) = 0.$$
Hence, gradient fields are curl-free (vortex-free), and curl fields are source-free. 
Point-like singularities do not alter these results. 
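
The power rule for the gradient of rⁿ, in the form ∇rⁿ = n rⁿ⁻² r that is recalled again in Sect. 1.1.12, can be checked by finite differences, including a fractional and a negative exponent. This is only a sketch: the sample point, the step size, and the function names are assumptions made for illustration.

```python
import math

def r_power(p, n):
    return math.sqrt(p[0]**2 + p[1]**2 + p[2]**2) ** n

def grad_numeric(p, n, h=1e-5):
    """Central-difference gradient of r^n at the point p."""
    g = []
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        g.append((r_power(plus, n) - r_power(minus, n)) / (2.0 * h))
    return g

def grad_formula(p, n):
    """Closed form n r^(n-2) r."""
    r = math.sqrt(sum(c * c for c in p))
    return [n * r ** (n - 2) * c for c in p]

point = (1.0, -2.0, 0.5)   # assumed sample point away from the origin
max_error = max(
    abs(num - ref)
    for n in (3, -1, 0.5)
    for num, ref in zip(grad_numeric(point, n), grad_formula(point, n)))
print(max_error)
```

The point is kept away from the origin, where negative powers are singular.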
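The operator $\Delta$ in the expression

$$\Delta\psi \equiv \nabla\cdot\nabla\psi$$

is called the Laplace operator. For a final reformulation, we make use once again of
a result in Sect. 1.1.2, namely $\mathbf{b}\cdot\mathbf{c}\;\mathbf{a} = \mathbf{c}\,(\mathbf{b}\cdot\mathbf{a}) - \mathbf{b}\times(\mathbf{c}\times\mathbf{a})$, whence

$$\Delta\mathbf{a} \equiv \nabla\cdot\nabla\mathbf{a} = \nabla(\nabla\cdot\mathbf{a}) - \nabla\times(\nabla\times\mathbf{a}).$$

Therefore, this operator can act on scalars $\psi(\mathbf{r})$ and vectors $\mathbf{a}(\mathbf{r})$. In Cartesian coordinates
it reads in both cases

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}.$$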


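According to Gauss's theorem,

$$\int_{(V)} d\mathbf{f}\cdot\nabla\psi = \int_V dV\,\Delta\psi, \quad\text{thus}\quad \Delta\psi = \lim_{V\to 0}\frac{1}{V}\int_{(V)} d\mathbf{f}\cdot\nabla\psi.$$

The Laplace operator is thus to be understood as the limit of a surface integral. It
is apparently only different from zero if $\nabla\psi$ changes on the surface $(V)$. A further
important relation is

$$\nabla\cdot(\psi\,\nabla\varphi - \varphi\,\nabla\psi) = \psi\,\Delta\varphi - \varphi\,\Delta\psi,$$

which can be derived from the above equations.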

According to Gauss's theorem a source- and curl-free field has to vanish everywhere,
if it vanishes on the surface ("at infinity"). Every curl-free vector field can
be represented as a gradient field $\nabla\psi$, where $\psi$ obeys the Laplace equation $\Delta\psi = 0$
everywhere, because the field is also taken to be source-free. Hence, we have
$\nabla\cdot(\psi\,\nabla\psi) = \nabla\psi\cdot\nabla\psi$, and according to Gauss's theorem $\int_{(V)} d\mathbf{f}\cdot\psi\,\nabla\psi = \int_V dV\,\nabla\psi\cdot\nabla\psi$.
The left-hand side has to be zero, and on the right the integrand is nowhere
negative, whence it has to vanish everywhere.


1.1.9 Integral Theorems for Vector Expressions 
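The concepts gradient, divergence, and rotation follow from the equations

$$\int_V dV\,\nabla\psi = \int_{(V)} d\mathbf{f}\,\psi,$$
$$\int_V dV\,\nabla\cdot\mathbf{a} = \int_{(V)} d\mathbf{f}\cdot\mathbf{a} \quad\text{(Gauss's theorem)},$$
$$\int_V dV\,\nabla\times\mathbf{a} = \int_{(V)} d\mathbf{f}\times\mathbf{a}.$$

Dividing a finite volume into infinitesimal parts, the contributions of adjacent planes
cancel in pairs. Corresponding to these, we found in Sect. 1.1.7 [the first expression
is, of course, also equal to $\int_A (d\mathbf{f}\times\nabla)\cdot\mathbf{a}$]

$$\int_A d\mathbf{f}\cdot(\nabla\times\mathbf{a}) = \oint_{(A)} d\mathbf{r}\cdot\mathbf{a} \quad\text{(Stokes's theorem)},$$
$$\int_A d\mathbf{f}\times\nabla\psi = \oint_{(A)} d\mathbf{r}\,\psi.$$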




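The last equation can be proven like Stokes's theorem. Likewise, we may also derive
the following equation:

$$\int_A (d\mathbf{f}\times\nabla)\times\mathbf{a} = \oint_{(A)} d\mathbf{r}\times\mathbf{a}.$$

If we take the area element $d\mathbf{f} = \mathbf{e}_x\,dy\,dz$ once again, then using the vector product
expansion on p. 4, the integrand on the left-hand side is equal to $\nabla(\mathbf{e}_x\cdot\mathbf{a}) - \mathbf{e}_x\,\nabla\cdot\mathbf{a}$.
On the right, one has the same, namely, $dz\,\mathbf{e}_z\times(\partial\mathbf{a}/\partial y)\,dy - dy\,\mathbf{e}_y\times(\partial\mathbf{a}/\partial z)\,dz$.
In addition, since $\nabla\cdot(\psi\mathbf{a}) = \psi\,\nabla\cdot\mathbf{a} + \mathbf{a}\cdot\nabla\psi$, Gauss's theorem implies

$$\int_{(V)} d\mathbf{f}\cdot\psi\mathbf{a} = \int_V dV\,(\psi\,\nabla\cdot\mathbf{a} + \mathbf{a}\cdot\nabla\psi).$$

(Here the left- and right-hand sides should be interchanged, i.e., the triple integral
should be simplified to a double integral.) Hence, with $\mathbf{a} = \nabla\varphi$, we deduce the first and second
Green theorems

$$\int_{(V)} d\mathbf{f}\cdot\psi\,\nabla\varphi = \int_V dV\,(\psi\,\Delta\varphi + \nabla\psi\cdot\nabla\varphi),$$
$$\int_{(V)} d\mathbf{f}\cdot(\psi\,\nabla\varphi - \varphi\,\nabla\psi) = \int_V dV\,(\psi\,\Delta\varphi - \varphi\,\Delta\psi).$$

Taking $\psi$ as the Cartesian component of a vector $\mathbf{b}$, we may also infer

$$\int_{(V)} (d\mathbf{f}\cdot\mathbf{a})\,\mathbf{b} = \int_V dV\,\{\mathbf{b}\,(\nabla\cdot\mathbf{a}) + (\mathbf{a}\cdot\nabla)\,\mathbf{b}\}.$$

For $\mathbf{b} = \mathbf{r}$, since $(\mathbf{a}\cdot\nabla)\,\mathbf{r} = \mathbf{a}$, it also follows that

$$\int_V dV\,\mathbf{a} = \int_{(V)} (d\mathbf{f}\cdot\mathbf{a})\,\mathbf{r} - \int_V dV\,\mathbf{r}\,(\nabla\cdot\mathbf{a}).$$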


The volume integral over a source-free vector field a is thus always zero if a vanishes 
on the surface (V). 
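Finally, we should mention the equation

$$\int_{(V)} d\mathbf{f}\times\psi\mathbf{a} = \int_V dV\,(\psi\,\nabla\times\mathbf{a} - \mathbf{a}\times\nabla\psi),$$

where we have used $\nabla\times(\psi\mathbf{a}) = \psi\,\nabla\times\mathbf{a} - \mathbf{a}\times\nabla\psi$.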




1.1.10 Delta Function 


In the following, we shall often use the Dirac delta function. Therefore, its properties 
are compiled here, even though it does not actually belong to vector analysis, but to 
general analysis (and in particular to integral calculus). 

We start with the Kronecker symbol

$$\delta_{ik} = \begin{cases} 0 & \text{for } i \ne k, \\ 1 & \text{for } i = k. \end{cases}$$

It is useful for many purposes. In particular we may use it to filter out the $k$th element
of a sequence $\{f_i\}$:

$$f_k = \sum_i f_i\,\delta_{ik}.$$

Here, of course, within the sum, one of the $i$ has to take the value $k$. Now, if we
make the transition from the countable (discrete) variables $i$ to a continuous quantity
$x$, then we must also generalize the Kronecker symbol. This yields Dirac's delta
function $\delta(x - x')$. It is defined by the equation

$$f(x') = \int_a^b f(x)\,\delta(x - x')\,dx \quad\text{for } a < x' < b,\ \text{zero otherwise},$$


where $f(x)$ is an arbitrary continuous test function. If the variable $x$ (and hence also
$dx$) is a physical quantity with unit $[x]$, the delta function has the unit $[x]^{-1}$.

Obviously, the delta function $\delta(x - x')$ is not an ordinary function, because it has
to vanish for $x \ne x'$ and it has to be singular for $x = x'$, so that the integral becomes
$\int \delta(x - x')\,dx = 1$. Consequently, we have to extend the concept of a function:
$\delta(x - x')$ is a distribution, or generalized function, which makes sense only as a
weight factor in an integrand, while an ordinary function $y = f(x)$ is a map $x \to y$.
Every equation in which the delta function appears without an integral symbol is an
equation between integrands: on both sides of the equation, the integral symbol and
the test function have been left out.

The delta function is the derivative of the Heaviside step function:

$$\varepsilon(x - x') = \begin{cases} 0 & \text{for } x < x' \\ 1 & \text{for } x > x' \end{cases} \quad\Longrightarrow\quad \delta(x) = \varepsilon'(x).$$

At the discontinuity, the value of the step function is not usually fixed, although the
mean value 1/2 is sometimes taken, whence it becomes point symmetric. The step
function is often called the theta function and denoted by $\theta$ (or $\Theta$) instead of $\varepsilon$ (contrary
to the IUPAP recommendation). The derivative of the step function vanishes for
$x \ne x'$, while $\int_a^b \varepsilon'(x - x')\,dx = \varepsilon(b - x') - \varepsilon(a - x')$ is equal to one for
$a < x' < b$ and zero for other values of $x'$.


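Hence, using

$$\varepsilon(x) = \frac12 + \frac1\pi\lim_{\epsilon\to+0}\arctan\frac{x}{\epsilon},$$

we find the important equations

$$\delta(x) = \frac1\pi\lim_{\epsilon\to+0}\frac{\epsilon}{x^2 + \epsilon^2} = \frac{1}{2\pi i}\lim_{\epsilon\to+0}\left(\frac{1}{x - i\epsilon} - \frac{1}{x + i\epsilon}\right) \equiv \frac{1}{2\pi i}\left(\frac{1}{x - io} - \frac{1}{x + io}\right).$$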


We may thus represent the generalized function $\delta(x)$ as a limit of ordinary functions
which are concentrated ever more sharply at only one position. According to the last
equation it is practical here to decompose the delta function in the complex plane
into two functions with poles at $\pm io$ and opposite residues, and then to take
the limit $o \to +0$.
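
To make the limiting process concrete, the Lorentzian ε/[π(x² + ε²)] can be integrated against a smooth test function for decreasing width. The sketch below assumes the test function f(x) = exp(−x²), for which the smeared value is known in closed form as exp(ε²) erfc(ε) and tends to f(0) = 1; the integration window, grid size, and names are illustrative assumptions.

```python
import math

def lorentzian(x, eps):
    """Nascent delta function eps / (pi (x^2 + eps^2))."""
    return eps / (math.pi * (x * x + eps * eps))

def smeared_value(f, eps, half_width=10.0, n=400_000):
    """Midpoint-rule approximation of the integral of f(x) * lorentzian(x, eps)."""
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * dx
        total += f(x) * lorentzian(x, eps)
    return total * dx

f = lambda x: math.exp(-x * x)      # assumed test function with f(0) = 1
widths = (0.1, 0.01)
values = [smeared_value(f, eps) for eps in widths]
print(values)
```

The smaller the width, the closer the weighted integral comes to the value of the test function at the origin.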

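Clearly, we also have

$$\frac{1}{x \pm io} = \frac12\left(\frac{1}{x + io} + \frac{1}{x - io}\right) \pm \frac12\left(\frac{1}{x + io} - \frac{1}{x - io}\right) = \mp i\pi\,\delta(x) + \frac12\left(\frac{1}{x + io} + \frac{1}{x - io}\right),$$

if we make use of $\pi\,\delta(x) = \tfrac12 i\,\{(x + io)^{-1} - (x - io)^{-1}\}$ for the second reformulation.
Here, the expression in the last bracket vanishes for $x^2 \ll o^2$, while it turns
into $2x/(x^2 + o^2) \approx 2/x$ for $x^2 \gg o^2$. This can be exploited for the principal-value
integral (the principal value) $P\ldots$, a kind of opposite to the delta function, because
it leaves out the singular position $x'$ in the integration, with equally small paths on
either side of it:

$$P\!\int_a^b \frac{dx}{x - x'}\ldots \equiv \lim_{\epsilon\to+0}\left(\int_a^{x' - \epsilon} + \int_{x' + \epsilon}^b\right)\frac{dx}{x - x'}\ldots$$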


Like the delta function, the symbol $P$ also makes sense only in the context of an
integral. Hence we may also write the equation above as

$$\frac{1}{x \pm io} = \frac{P}{x} \mp i\pi\,\delta(x).$$

This result is obtained rather crudely here, because the infinitesimal quantity $o$ is
supposed to be arbitrarily small, but nevertheless different from zero. It can be proven
using the residue theorem from the theory of complex functions. To this end, we
consider

$$\int_{-\infty}^{\infty}\frac{f(x)\,dx}{x - (x' - io)} \pm \int_{-\infty}^{\infty}\frac{f(x)\,dx}{x - (x' + io)} = \left(\int_{C_1} \pm \int_{C_2}\right)\frac{f(z)\,dz}{z - z'},$$

with the two integrations running from left to right along $C_1$ (above) and $C_2$
(below the symmetry axis) in Fig. 1.6 for regular test functions $f(z)$. In the complex




Fig. 1.6 Integration paths $C_1$ and $C_2$ (continuous lines) to determine the principal value and the
residues. The (real) symmetry axis is shown by the dashed line


$z$-plane the integrand only has the pole at $z' = x' - io$ in the lower half-plane and at
$x' + io$ in the upper half-plane, whence the indicated integrations can be performed.

The difference between the two integrals is equal to $-\oint f(z)\,(z - z')^{-1}\,dz$,
according to the residue theorem, thus equal to $-2\pi i\,f(x')$. In the sum of the two
integrals the contributions from the half circles cancel, since for $z = z' + \epsilon\exp(i\varphi)$,
we have $dz = i\epsilon\exp(i\varphi)\,d\varphi = i(z - z')\,d\varphi$, and what remains is twice the principal
value, which is what was to be shown. Hence, we have proven our claim that
$(x \pm io)^{-1} = P\,x^{-1} \mp i\pi\,\delta(x)$.

Since $x\,\delta(x) = 0$, the integrand may even be divided by functions which have
zeros:

$$A = B \quad\Longleftrightarrow\quad \frac{A}{x} = \frac{B}{x} + C\,\delta(x).$$


The constant $C$ in the integrals can be fixed, provided that we also fix the integration
path across the singularity (e.g., as for the principal value integral).
An important property of the delta function is

$$\delta(ax) = \frac{1}{|a|}\,\delta(x),$$


because both sides are equal to $d\varepsilon(y)/dy$ for $y = ax$. In particular, the delta function
is even, i.e., $\delta(-x) = \delta(x)$. Hence we can even infer $\int_0^\infty \delta(x)\,dx = \tfrac12$. If instead of
$ax$ we take a function $a(x)$ as argument, and if $a(x)$ has only one-fold zeros $x_n$, then
it follows that

$$\delta(a(x)) = \sum_n \frac{\delta(x - x_n)}{|a'(x_n)|},$$

and in particular also that $\delta(x^2 - x_0^2) = \{\delta(x - x_0) + \delta(x + x_0)\}/(2|x_0|)$.
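
The rule for a delta function with a function as argument, here in the special case δ(x² − x₀²) = {δ(x − x₀) + δ(x + x₀)}/(2|x₀|), can be tested by replacing the delta function with a narrow Gaussian. This is a sketch under assumptions: the Gaussian width `sigma`, the test function, and all names are illustrative, and agreement is only expected up to corrections that vanish with the width.

```python
import math

def narrow_gaussian(u, sigma):
    """Normalized Gaussian of width sigma, used as a stand-in for delta(u)."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def integral_with_delta_of_a(f, a, sigma=0.01, lo=-4.0, hi=4.0, n=80_000):
    """Midpoint-rule approximation of the integral of f(x) * delta(a(x))."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += f(x) * narrow_gaussian(a(x), sigma)
    return total * dx

x0 = 2.0
f = lambda x: x * x + x + 1.0        # assumed test function
numeric = integral_with_delta_of_a(f, lambda x: x * x - x0 * x0)
expected = (f(x0) + f(-x0)) / (2.0 * abs(x0))
print(numeric, expected)
```

Both zeros of the argument contribute, each weighted with the inverse slope 1/|2x₀|.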
In addition, $\iint f(x)\,\delta(x - y)\,\delta(y - x')\,dx\,dy = \int f(y)\,\delta(y - x')\,dy = f(x') = \int f(x)\,\delta(x - x')\,dx$
delivers the equation

$$\int \delta(x - y)\,\delta(y - x')\,dy = \delta(x - x').$$

This is similar to the defining equation of the delta function, in which we allowed
only for ordinary, continuous functions as test functions.

For the $n$th derivative of the delta function, $n$ partial integrations (for $a < x' < b$,
zero otherwise) result in

$$\int_a^b f(x)\,\delta^{(n)}(x - x')\,dx = (-)^n f^{(n)}(x'),$$

because the limits do not contribute. It thus follows that $x\,\delta'(x) = -\delta(x)$, which we
shall need in quantum theory (Sect. 4.3.2) for the real-space representation of the
momentum operator, viz., $\mathbf{P} = (\hbar/i)\,\nabla$.

If, in the interval $a \le x \le b$, we have a complete orthonormal set of functions
$\{g_n(x)\}$, i.e., a series of functions with the properties

$$\int_a^b g_n^*(x)\,g_{n'}(x)\,dx = \delta_{nn'}$$

as well as $f(x) = \sum_n g_n(x)\,f_n$ for all (square-integrable) functions $f(x)$, then after
interchange of summation and integration, we have $f_n = \int_a^b g_n^*(x)\,f(x)\,dx$ for
the expansion coefficients, and hence $\sum_n \int_a^b g_n^*(x)\,g_n(x')\,f(x)\,dx = f(x')$, which
leads to

$$\delta(x - x') = \sum_n g_n^*(x)\,g_n(x').$$

Each complete set of functions delivers a representation of the delta function, i.e., it
can be expanded in terms of ordinary functions.

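In particular, we can expand the delta function in the interval $-a \le x \le a$
in terms of a Fourier series: we have $g_n(x) = (1/\sqrt{2a})\,\exp(i\pi n x/a)$ with $n \in \{0, \pm1, \pm2, \ldots\}$
and (the result is even in $x - x'$)

$$\delta(x - x') = \frac{1}{2a}\sum_{n=-\infty}^{\infty}\exp\left(i\pi n\,\frac{x - x'}{a}\right) \quad\text{for } -a < x < a.$$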
a a 


For $a \to \infty$, we can even go over to a Fourier integral. For very large $a$, the sequence
$k_n = n\pi/a$ becomes nearly continuous. Therefore, we replace the sum $\sum_n f(k_n)\,\Delta k$
with $\Delta k = \pi/a$ by its associated integral

$$\delta(x - x') = \frac{1}{2\pi}\int_{-\infty}^{\infty}\exp\{ik(x - x')\}\,dk \quad\text{for } -\infty < x < \infty.$$

For the Fourier expansion, we therefore take $g(k, x) = (1/\sqrt{2\pi})\,\exp(ikx)$. We now
have the basics for the Fourier transform, which we shall discuss in the next section.

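The integral from $-\infty$ to $+\infty$ can be decomposed into the one from $-\infty$ to
0 plus the one from 0 to $+\infty$. But with $k \to -k$, we have $\int_{-\infty}^0 \exp(ikx)\,dk = \int_0^\infty \exp(-ikx)\,dk$,
so this part delivers the complex conjugate of the other part.
Therefore, we infer $\operatorname{Re}\int_0^\infty \exp(ikx)\,dk = \pi\,\delta(x)$ or

$$\delta(x) = \frac1\pi\int_0^\infty \cos kx\;dk \quad\text{and}\quad \varepsilon(x) = \frac12 + \frac1\pi\int_0^\infty \frac{\sin kx}{k}\,dk.$$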




On the other hand, the usual integration rules for $\int_0^\infty \exp(ikx)\,dk$ deliver the expression
$(ix)^{-1}\exp(ikx)\big|_0^\infty$. For real $x$, this is undetermined for $k \to \infty$. But if $x$
contains an (even very small) positive imaginary part, then it vanishes for $k \to \infty$.
We include this small positive imaginary part of $x$ as before through $x + io$ (with
real $x$):

$$\int_0^\infty \exp(ikx)\,dk = \frac{i}{x + io} = \pi\,\delta(x) + i\,\frac{P}{x}.$$

We have already proven this for the real part of the integral, because the real part
of the right-hand side has turned out to be equal to $\pi\,\delta(x)$. But then the equation
holds also for the imaginary part, because the proof used only general properties of
integrals.


1.1.11 Fourier Transform 


If the region of definition is infinite on both sides, we use

$$f(x) = \int_{-\infty}^{\infty} g(k, x)\,f(k)\,dk, \qquad f(k) = \int_{-\infty}^{\infty} g^*(k, x)\,f(x)\,dx,$$

with $g(k, x) = (1/\sqrt{2\pi})\,\exp(ikx)$:

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp(+ikx)\,f(k)\,dk, \qquad f(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\exp(-ikx)\,f(x)\,dx.$$

Generally, $f(x)$ and $f(k)$ are different functions of their arguments, but we would
like to distinguish them only through their argument. [The less symmetric notation
$f(x) = \int \exp(ikx)\,F(k)\,dk$ with $F(k) = f(k)/\sqrt{2\pi}$ is often used. This avoids the
square root factor with the agreement that $(2\pi)^{-1}$ always appears with $dx$.] Instead
of the pair of variables $x \leftrightarrow k$, the pair $t \leftrightarrow \omega$ is also often used.

Important properties of the Fourier transform are

$$f(x) = f^*(x) \quad\Longleftrightarrow\quad f(k) = f^*(-k),$$
$$f(x) = g(x)\,h(x) \quad\Longleftrightarrow\quad f(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(k - k')\,h(k')\,dk',$$
$$f(x) = g(x - x') \quad\Longleftrightarrow\quad f(k) = \exp(-ikx')\,g(k).$$

For a periodic function $f(x) = f(x - L)$ the last relation leads to the condition $k_n = 2\pi n/L$
with $n \in \{0, \pm1, \pm2, \ldots\}$, thus to a Fourier series instead of the integral.
In addition, by Fourier transform, all convolution integrals $\int g(x - x')\,h(x')\,dx'$ can
clearly be turned into products $\sqrt{2\pi}\,g(k)\,h(k)$ (Problem 3.9), which are much easier
to handle.

If $f(x)$ vanishes for all $x < 0$, then $f(x) = \varepsilon(x)\,f(x)$ holds with the step function
mentioned in the last section, e.g., for "causal functions" $f(t)$, which depend upon
the time $t$. Then the Fourier transform yields the relation

$$f(x) = \varepsilon(x)\,f(x) \quad\Longleftrightarrow\quad f(k) = \frac{i}{\pi}\,P\!\int_{-\infty}^{\infty} dk'\,\frac{f(k')}{k' - k}.$$

Here, due to the factor $i$ in the Fourier transformed $f(k)$, the real and imaginary
parts are related to each other in such a way that only the one or the other (for all
$k$) needs to be measured. This relation is sometimes called the Kramers–Kronig or
dispersion relation, even though the latter actually also exploits the fact that $f(x)$ is real,
whence the integration has to be performed over just half the region, viz., 0 to $\infty$.
Another result that is often useful is Parseval's equation

$$\int dx\;g^*(x)\,h(x) = \int dk\;g^*(k)\,h(k).$$

In order to prove it, we expand the left-hand side according to Fourier and obtain
the integral $(2\pi)^{-1}\iiint dx\,dk\,dk'\,\exp\{i(k - k')x\}\,g^*(k')\,h(k)$. After integration over
$x$, we encounter the delta function $2\pi\,\delta(k - k')$ and can then also integrate easily
over $k'$, which yields the right-hand side. In particular, $\int dx\,|f(x)|^2 = \int dk\,|f(k)|^2$.

Table 1.2 shows some of the Fourier transforms commonly encountered. To prove
the last relation in the table, we have to complete the square in the exponent and use the
integral $\int_{-\infty}^{\infty}\exp(-x^2)\,dx = \sqrt{\pi}$, the latter following from

$$\iint_{-\infty}^{\infty}\exp(-x^2 - y^2)\,dx\,dy = 2\pi\int_0^\infty \exp(-r^2)\,r\,dr = \pi\int_0^\infty e^{-s}\,ds = \pi, \quad\text{with } s = r^2 = x^2 + y^2.$$
ae 0 


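Table 1.2 Some functions and their Fourier transforms

  f(x)                                                   f(k)
  -----------------------------------------------------  -----------------------------------------------------------------
  $\delta(x - x')$                                       $\dfrac{\exp(-ikx')}{\sqrt{2\pi}}$
  $\dfrac{1}{2a}\,\varepsilon(a^2 - x^2)$                $\dfrac{1}{\sqrt{2\pi}}\,\dfrac{\sin(ak)}{ak}$
  $\varepsilon(x)\,\exp(-\lambda x)$                     $\dfrac{1}{\sqrt{2\pi}}\,\dfrac{1}{\lambda + ik}$  if $\operatorname{Re}\lambda > 0$
  $\exp\left(-\dfrac{(x - x')^2}{2\Delta^2}\right)$      $\Delta\,\exp(-\tfrac12\Delta^2 k^2)\,\exp(-ikx')$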


Fig. 1.7 Fourier transform (left, red) of the box function (right, blue). This is useful, e.g., for the
diffraction by a slit


Fig. 1.8 Fourier transform (right) of the truncated exponential function $f(x) = \varepsilon(x)\,\exp(-\lambda x)$
(left). This is useful for decay processes, if $x$ stands for the time and $k$ for the angular frequency.
Here the dashed blue curve shows the real part and the continuous red curve the imaginary part of
$\lambda f(k)$. The Kramers–Kronig relation relates these real and imaginary parts


From the first example with x’ = 0, the Fourier transform of a constant is a delta 
function, and from the fourth example with x’ = 0, the Fourier transform of a Gaus- 
sian function is a Gaussian function again. The second relation is represented in 
Fig. 1.7 and the third in Fig. 1.8. 
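
An entry of Table 1.2 can be cross-checked by direct numerical integration. The sketch below assumes the symmetric convention f(k) = (2π)^(-1/2) ∫ exp(−ikx) f(x) dx used in this section and the box function of height 1/(2a) on −a < x < a; it compares a midpoint-rule transform with the closed form sin(ak)/(ak)/√(2π). Function names and the sample values of `a` and `k` are illustrative.

```python
import cmath
import math

def box(x, a):
    """Box function of height 1/(2a) on -a < x < a."""
    return 1.0 / (2.0 * a) if abs(x) < a else 0.0

def fourier_transform(f, k, lo, hi, n=20_000):
    """f(k) = (2 pi)^(-1/2) * integral of exp(-i k x) f(x) dx, midpoint rule on [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0 + 0.0j
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += cmath.exp(-1j * k * x) * f(x)
    return total * dx / math.sqrt(2.0 * math.pi)

a, k = 1.0, 3.0
numeric = fourier_transform(lambda x: box(x, a), k, -a, a)
table_entry = math.sin(a * k) / (a * k) / math.sqrt(2.0 * math.pi)
print(numeric, table_entry)
```

Because the box function is real and even, the imaginary part of the numerical transform should vanish up to rounding.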

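Correspondingly, in three dimensions with $\mathbf{k}$ as wave vector (more on p. 137), we
have

$$\delta(\mathbf{k} - \mathbf{k}') = \frac{1}{(2\pi)^3}\int d^3r\;\exp\{i(\mathbf{k} - \mathbf{k}')\cdot\mathbf{r}\},$$
$$f(\mathbf{r}) = \frac{1}{\sqrt{2\pi}^{\,3}}\int d^3k\;\exp(+i\mathbf{k}\cdot\mathbf{r})\,f(\mathbf{k}),$$
$$f(\mathbf{k}) = \frac{1}{\sqrt{2\pi}^{\,3}}\int d^3r\;\exp(-i\mathbf{k}\cdot\mathbf{r})\,f(\mathbf{r}).$$

Here, $d^3r$ is used for the volume element $dV$ in real space and correspondingly $d^3k$
for the volume element in reciprocal space. In Cartesian coordinates, we then have
$\delta(\mathbf{r} - \mathbf{r}') = \delta(x - x')\,\delta(y - y')\,\delta(z - z')$.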


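From the expansion

$$\mathbf{a}(\mathbf{r}) = \frac{1}{\sqrt{2\pi}^{\,3}}\int d^3k\;\exp(i\mathbf{k}\cdot\mathbf{r})\,\mathbf{a}(\mathbf{k})$$

of a vector field $\mathbf{a}(\mathbf{r})$, since Fourier expansions are unique, it follows that

$$\nabla\times\mathbf{a}(\mathbf{r}) = \mathbf{b}(\mathbf{r}) \quad\Longleftrightarrow\quad i\mathbf{k}\times\mathbf{a}(\mathbf{k}) = \mathbf{b}(\mathbf{k})$$

and

$$\nabla\cdot\mathbf{a}(\mathbf{r}) = b(\mathbf{r}) \quad\Longleftrightarrow\quad i\mathbf{k}\cdot\mathbf{a}(\mathbf{k}) = b(\mathbf{k}).$$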


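If, for example, the curly bracket in $\int d^3k\,\exp(i\mathbf{k}\cdot\mathbf{r})\,\{i\mathbf{k}\times\mathbf{a}(\mathbf{k}) - \mathbf{b}(\mathbf{k})\}$ vanishes
for all $\mathbf{k}$, then of course the integral also does for all $\mathbf{r}$. Rotation-free fields thus
have Fourier components $\mathbf{a}(\mathbf{k})$ in the direction of the wave vector (longitudinal field
$\mathbf{a}_{\text{long}}$). In contrast, source-free fields have Fourier components $\mathbf{a}(\mathbf{k})$ perpendicular to
the wave vector (transverse field $\mathbf{a}_{\text{trans}}$). According to p. 4, the decomposition

$$\mathbf{a}(\mathbf{k}) = \mathbf{e}_k\,(\mathbf{e}_k\cdot\mathbf{a}(\mathbf{k})) - \mathbf{e}_k\times(\mathbf{e}_k\times\mathbf{a}(\mathbf{k})), \quad\text{with } \mathbf{e}_k \equiv \frac{\mathbf{k}}{k},$$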


therefore splits up into a longitudinal and a transverse part, i.e., into the vortex-free 
and the source-free part. 
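
The splitting of a Fourier component into a longitudinal and a transverse part can be sketched in a few lines. The example assumes real three-component vectors for simplicity (in general a(k) is complex) and uses illustrative names; it implements a = e (e · a) − e × (e × a) with e = k/|k|.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def split(a, k):
    """Return (a_long, a_trans) of a with respect to the wave vector k."""
    k_norm = math.sqrt(dot(k, k))
    e = tuple(c / k_norm for c in k)
    a_long = tuple(dot(e, a) * c for c in e)            # e (e . a)
    a_trans = tuple(-c for c in cross(e, cross(e, a)))  # -e x (e x a)
    return a_long, a_trans

a_vec = (1.0, 2.0, -0.5)   # assumed sample Fourier component
k_vec = (0.3, -1.0, 2.0)   # assumed sample wave vector
a_long, a_trans = split(a_vec, k_vec)
print(a_long, a_trans)
```

The longitudinal part is parallel to k, so k × a_long vanishes, while the transverse part satisfies k · a_trans = 0, which corresponds to the vortex-free and source-free parts, respectively.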

Some important examples of Fourier transforms in the three-dimensional space 
are listed on p. 410. 


1.1.12 Calculation of a Vector Field from Its Sources 
and Curls 


Every vector field that is continuous everywhere and vanishes at infinity can be
uniquely determined from its sources and curls (rotations, vortices):

$$\mathbf{a}(\mathbf{r}) = -\nabla\int dV'\,\frac{\nabla'\cdot\mathbf{a}(\mathbf{r}')}{4\pi\,|\mathbf{r} - \mathbf{r}'|} + \nabla\times\int dV'\,\frac{\nabla'\times\mathbf{a}(\mathbf{r}')}{4\pi\,|\mathbf{r} - \mathbf{r}'|}.$$

The first term here becomes fixed by the sources of $\mathbf{a}$ and, like every pure gradient
field, is vortex-free, while the second, like every pure vortex field, is source-free and
becomes fixed by the vortex of $\mathbf{a}$. The operator $\nabla'$ acts on the coordinate $\mathbf{r}'$, while
$\nabla$ acts on the coordinate $\mathbf{r}$ and therefore may be interchanged with the integration.
The decomposition is unique. If there were two different vector fields $\mathbf{a}_1$ and $\mathbf{a}_2$
with the same sources and curls, then $\mathbf{a}_1 - \mathbf{a}_2$ would have neither sources nor curls,
and in addition would vanish at infinity. But according to p. 16, $\mathbf{a}_1 = \mathbf{a}_2$ has to hold.


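To prove the claim, we evaluate $\nabla\cdot\mathbf{a}$ and $\nabla\times\mathbf{a}$:

\[
\nabla\cdot\mathbf{a} = -\frac{1}{4\pi} \int dV'\, \nabla'\cdot\mathbf{a}(\mathbf{r}')\; \Delta\frac{1}{|\mathbf{r}-\mathbf{r}'|},
\]
\[
\nabla\times\mathbf{a} = -\frac{1}{4\pi} \int dV'\, \Big\{ \nabla'\times\mathbf{a}(\mathbf{r}')\; \Delta\frac{1}{|\mathbf{r}-\mathbf{r}'|}
 - \nabla\,\nabla\cdot\frac{\nabla'\times\mathbf{a}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \Big\}.
\]

Still, $\mathbf{a}(\mathbf{r})$ could contain a constant term, which would affect neither
$\nabla\cdot\mathbf{a}$ nor $\nabla\times\mathbf{a}$, but $\mathbf{a} = 0$ has to hold at
infinity and this fixes this term uniquely. Now we show (and this is sufficient for the
proof) that

\[
\Delta\frac{1}{|\mathbf{r}-\mathbf{r}'|} = -4\pi\, \delta(\mathbf{r}-\mathbf{r}'),
\]

and that the last term in $\nabla\times\mathbf{a}$ does not contribute. With
$\mathbf{r}' = 0$ and recalling from Sect. 1.1.8 that $\nabla r^n = n\, r^{n-2}\, \mathbf{r}$,
we have

\[
\Delta\frac{1}{r} = \nabla\cdot\nabla\frac{1}{r} = -\nabla\cdot\frac{\mathbf{r}}{r^3}
 = -\Big( \frac{1}{r^3}\,\nabla\cdot\mathbf{r} + \mathbf{r}\cdot\nabla\frac{1}{r^3} \Big)
 = -\Big( \frac{3}{r^3} - 3\,\frac{\mathbf{r}\cdot\mathbf{r}}{r^5} \Big).
\]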

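This expression vanishes for $r \neq 0$. On the other hand, if we evaluate the source
strength at the origin using Gauss's theorem with a sphere of radius $r > 0$ around it,
we have

\[
\int dV\, \nabla\cdot\nabla\frac{1}{r} = \oint d\mathbf{f}\cdot\nabla\frac{1}{r}
 = -\frac{1}{r^2} \oint d\mathbf{f}\cdot\frac{\mathbf{r}}{r} = -4\pi.
\]

This shows the first part of the proof, since $\delta(\mathbf{r}-\mathbf{r}')$ vanishes for
$\mathbf{r} \neq \mathbf{r}'$ and $\int dV\, \delta(\mathbf{r}-\mathbf{r}')$ is equal to 1.
In addition, with $\mathbf{b} \equiv \nabla'\times\mathbf{a}(\mathbf{r}')$, which depends only
upon $\mathbf{r}'$, but not upon $\mathbf{r}$, we have

\[
\nabla\,\nabla\cdot\frac{\mathbf{b}}{|\mathbf{r}-\mathbf{r}'|}
 = \nabla\Big( \mathbf{b}\cdot\nabla\frac{1}{|\mathbf{r}-\mathbf{r}'|} \Big)
 = (\mathbf{b}\cdot\nabla)\, \nabla\frac{1}{|\mathbf{r}-\mathbf{r}'|}.
\]

Since $\nabla |\mathbf{r}-\mathbf{r}'|^{-1} = -\nabla' |\mathbf{r}-\mathbf{r}'|^{-1}$, this is
equal to $(\mathbf{b}\cdot\nabla')\, \nabla' |\mathbf{r}-\mathbf{r}'|^{-1}$, and using
$\oint_{(V)} d\mathbf{f}\cdot\mathbf{b}\; \mathbf{a} = \int_V dV\, \{\mathbf{a}\, \nabla\cdot\mathbf{b} + (\mathbf{b}\cdot\nabla)\, \mathbf{a}\}$
(see p. 17), it therefore delivers

\[
\int_V dV'\, \nabla\,\nabla\cdot\frac{\mathbf{b}}{|\mathbf{r}-\mathbf{r}'|}
 = \int_V dV'\, (\mathbf{b}\cdot\nabla')\, \nabla'\frac{1}{|\mathbf{r}-\mathbf{r}'|}
 = \oint_{(V)} d\mathbf{f}'\cdot\mathbf{b}\; \nabla'\frac{1}{|\mathbf{r}-\mathbf{r}'|}
 - \int_V dV'\, \Big( \nabla'\frac{1}{|\mathbf{r}-\mathbf{r}'|} \Big)\, \nabla'\cdot\mathbf{b}.
\]

Since $\nabla'\cdot\mathbf{b} = \nabla'\cdot(\nabla'\times\mathbf{a}(\mathbf{r}')) = 0$, the
last integral does not contribute. For the surface integral, we take a sphere with
sufficiently large radius $r'$. Its surface area is $4\pi r'^2$, while the magnitude of
$\nabla |\mathbf{r}-\mathbf{r}'|^{-1}$ is equal to $r'^{-2}$ there. Thus we only have to
require that $\nabla\times\mathbf{a}$ vanishes at the surface with $r' \to \infty$ and
everything is proven.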




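According to the relation $\Delta |\mathbf{r}-\mathbf{r}'|^{-1} = -4\pi\, \delta(\mathbf{r}-\mathbf{r}')$
just proven, the solution of the inhomogeneous differential equation
$\Delta\Phi = \phi(\mathbf{r})$ (Poisson equation) can be represented as an integral over the
inhomogeneity $\phi(\mathbf{r})$ with suitable weight factor. This weight factor is called the
Green function $G(\mathbf{r}, \mathbf{r}')$ of the Laplace operator:

\[
\Delta G(\mathbf{r}, \mathbf{r}') = \delta(\mathbf{r}-\mathbf{r}') \quad\Longleftrightarrow\quad
G(\mathbf{r}, \mathbf{r}') = \frac{-1}{4\pi}\, \frac{1}{|\mathbf{r}-\mathbf{r}'|}.
\]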
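
The two properties used in the proof (vanishing Laplacian away from the source point and
unit total source strength) can be checked numerically. This is a rough sketch with central
finite differences; the function names, the sample points, and the step sizes are assumptions
chosen only for illustration.

```python
import math

def green(r, r_prime):
    """Green function of the Laplace operator, G = -1 / (4 pi |r - r'|)."""
    return -1.0 / (4.0 * math.pi * math.dist(r, r_prime))

def laplacian(func, r, h=1e-3):
    """Central-difference estimate of the Laplacian of func at the point r."""
    total = 0.0
    for axis in range(3):
        plus = list(r)
        minus = list(r)
        plus[axis] += h
        minus[axis] -= h
        total += (func(plus) + func(minus) - 2.0 * func(r)) / h ** 2
    return total

source = (0.0, 0.0, 0.0)
field_point = (0.7, -0.4, 0.5)

# Away from the source point the Laplacian of G should vanish.
lap = laplacian(lambda r: green(r, source), field_point)

# Flux of grad G through a sphere of radius R around the source:
# by symmetry it is 4 pi R^2 times the radial derivative of G.
R = 2.0
h = 1e-4
dG_dr = (green((R + h, 0.0, 0.0), source) - green((R - h, 0.0, 0.0), source)) / (2.0 * h)
flux = 4.0 * math.pi * R ** 2 * dG_dr
```

Here `lap` is zero up to discretization error, and `flux` is close to 1, independent of the
chosen radius `R`, in line with $\Delta G = \delta(\mathbf{r}-\mathbf{r}')$.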


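In particular, it yields the solutions of the differential equations

\[
\Delta\Phi = \phi(\mathbf{r}) \quad\text{and}\quad \Delta\mathbf{A} = \mathbf{a}(\mathbf{r}),
\]

i.e., of $\nabla\cdot\nabla\Phi = \phi$ and of
$\nabla(\nabla\cdot\mathbf{A}) - \nabla\times(\nabla\times\mathbf{A}) = \mathbf{a}$ with
$\Phi \approx 0$ and $\mathbf{A} \approx 0$ for $r \to \infty$. In electromagnetism, we shall
meet them in the context of the scalar potential (Sect. 3.1.3) and the vector potential
(Sect. 3.2.8). These solutions are

\[
\Phi(\mathbf{r}) = \int dV'\, G(\mathbf{r}, \mathbf{r}')\, \phi(\mathbf{r}') \quad\text{and}\quad
\mathbf{A}(\mathbf{r}) = \int dV'\, G(\mathbf{r}, \mathbf{r}')\, \mathbf{a}(\mathbf{r}').
\]

By partial integration, they have the properties

\[
\nabla\Phi = \int dV'\, G(\mathbf{r}, \mathbf{r}')\, \nabla'\phi(\mathbf{r}'),
\]
\[
\nabla\cdot\mathbf{A} = \int dV'\, G(\mathbf{r}, \mathbf{r}')\, \nabla'\cdot\mathbf{a}(\mathbf{r}'),
\]
\[
\nabla\times\mathbf{A} = \int dV'\, G(\mathbf{r}, \mathbf{r}')\, \nabla'\times\mathbf{a}(\mathbf{r}').
\]

Here, we used the fact that $\Phi$ and $\mathbf{A}$ vanish at infinity, whence the
inhomogeneities $\phi$ and $\mathbf{a}$ vanish faster by two orders. Thus, if $\mathbf{a}$ is
source- or curl-free, the solution $\mathbf{A}$ of the Poisson equation
$\Delta\mathbf{A} = \mathbf{a}$ is likewise.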

The theorem proven in this section is called the principal theorem of vector anal- 
ysis. It assumes that the source and curl densities are known everywhere—these fix 
the vector fields. 


1.1.13 Vector Fields at Interfaces 


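If $\nabla\cdot\mathbf{a}$ or $\nabla\times\mathbf{a}$ are different from zero only on a
sheet, the volume integrals just mentioned simplify to surface integrals. Correspondingly,
instead of $\nabla\cdot\mathbf{a}$ and $\nabla\times\mathbf{a}$, we now introduce the surface
divergence and surface rotation. They have different units from $\nabla\cdot\mathbf{a}$ and
$\nabla\times\mathbf{a}$, related to the area instead of the volume:

\[
\mathrm{Div}\,\mathbf{a} \equiv \nabla_A\cdot\mathbf{a} \equiv \lim_{V\to 0} \frac{1}{A} \oint_{(V)} d\mathbf{f}\cdot\mathbf{a},
\]
\[
\mathrm{Rot}\,\mathbf{a} \equiv \nabla_A\times\mathbf{a} \equiv \lim_{V\to 0} \frac{1}{A} \oint_{(V)} d\mathbf{f}\times\mathbf{a}.
\]

Fig. 1.9 View of a sheet of discontinuity of a vector field. Dashed red lines show the envelope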


Here, $V$ is the volume of a thin layer covering the surface element $A$ (see Fig. 1.9).
Even though $A$ is infinitesimally small, it nevertheless has dimensions that are large
compared with the layer thickness, so only the faces contribute to the surface integrals
of the layer. With $\mathbf{n}$ as unit normal vector to the face, pointing "from minus to
plus", we may then write

\[
\nabla_A\cdot\mathbf{a} = \mathbf{n}\cdot(\mathbf{a}_+ - \mathbf{a}_-),
\]
\[
\nabla_A\times\mathbf{a} = \mathbf{n}\times(\mathbf{a}_+ - \mathbf{a}_-).
\]

Thus, if the vector field $\mathbf{a}$ changes in a step-like manner at a sheet (from
$\mathbf{a}_-$ to $\mathbf{a}_+$), then for $\delta\mathbf{a} \parallel \mathbf{n}$, it has an
area divergence (discontinuous normal component like, e.g., at the interface on the left in
Fig. 1.5) and for $\delta\mathbf{a} \perp \mathbf{n}$, it has an area rotation (discontinuous
tangential component like, e.g., at the interface on the right in Fig. 1.5).
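
As a small illustration of the two jump formulas, the sketch below evaluates the surface
divergence and surface rotation for a purely normal and a purely tangential step. The
function names and the sample field values are hypothetical and serve only as an example.

```python
# Sketch: surface divergence n . (a+ - a-) and surface rotation n x (a+ - a-)
# for a field that jumps across a sheet with unit normal n.
# Names and numbers are illustrative assumptions.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def surface_div(n, a_plus, a_minus):
    jump = tuple(p - m for p, m in zip(a_plus, a_minus))
    return dot(n, jump)

def surface_rot(n, a_plus, a_minus):
    jump = tuple(p - m for p, m in zip(a_plus, a_minus))
    return cross(n, jump)

n = (0.0, 0.0, 1.0)   # unit normal, pointing "from minus to plus"

# Jump parallel to n: only the normal component is discontinuous.
div_normal = surface_div(n, (1.0, 0.0, 2.0), (1.0, 0.0, 0.5))
rot_normal = surface_rot(n, (1.0, 0.0, 2.0), (1.0, 0.0, 0.5))

# Jump perpendicular to n: only a tangential component is discontinuous.
div_tangential = surface_div(n, (3.0, 0.0, 1.0), (1.0, 0.0, 1.0))
rot_tangential = surface_rot(n, (3.0, 0.0, 1.0), (1.0, 0.0, 1.0))
```

In the first case only the area divergence is non-zero, in the second only the area rotation.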


1.2 Coordinates 


1.2.1 Orthogonal Transformations and Euler Angles 


In order to perform sums, we now prefer to write $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ instead of $\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z$.
In addition, the coordinate origin will be assumed fixed here for every coordinate 
transformation. Displacements would be easy to include. 

For the transition from a Cartesian frame $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ to
one rotated about the origin $\{\mathbf{e}_1', \mathbf{e}_2', \mathbf{e}_3'\}$, we have

\[
\mathbf{e}_i' = \sum_k (\mathbf{e}_i'\cdot\mathbf{e}_k)\, \mathbf{e}_k \equiv \sum_k D_{ik}\, \mathbf{e}_k
\]

and

\[
\mathbf{e}_k = \sum_i (\mathbf{e}_k\cdot\mathbf{e}_i')\, \mathbf{e}_i' = \sum_i D_{ik}\, \mathbf{e}_i'.
\]

Since $\mathbf{e}_i\cdot\mathbf{e}_k = \delta_{ik} = \mathbf{e}_i'\cdot\mathbf{e}_k'$,

\[
\sum_i D_{ij}\, D_{ik} = \delta_{jk} = \sum_i D_{ji}\, D_{ki}, \quad\text{and in addition}\quad D_{ik} = D_{ik}^{\,*}.
\]

These equations may be written as matrix equations, if we understand $D_{ik}$ as the
element of the matrix $D$ in row $i$ and column $k$. Then, if $\tilde{D}$ is the transpose of
$D$ (with $\tilde{D}_{ik} = D_{ki}$), we have

\[
\tilde{D} D = 1 = D \tilde{D} \quad (\text{so } D^{-1} = \tilde{D}), \quad\text{and in addition}\quad D = D^*.
\]

This is called an orthogonal transformation. If $D^{-1} = \tilde{D}^* \equiv D^\dagger$, the
transformation is unitary. Real unitary transformations are thus orthogonal transformations.
Because $\det(D_2 D_1) = \det D_2 \cdot \det D_1$ and $\det\tilde{D} = \det D$ (see p. 5),
orthogonal transformations have $\det D = \pm 1$. Depending on the sign, we distinguish
between proper orthogonal transformations with $\det D = +1$ and improper orthogonal
transformations with $\det D = -1$. Only the proper ones are connected continuously to the
identity and therefore correspond to rotations. But if we go over from a right- to a
left-handed frame, then this is an improper transformation. In particular,
$D_{ik} = -\delta_{ik}$, i.e., $D = -1$, corresponds to a space reflection (inversion or
parity operation).

Carrying out two rotations $D_1$ and $D_2$ one after the other amounts to doing a single
rotation $D = D_2 D_1$, because $\tilde{D} D = \tilde{D}_1 \tilde{D}_2 D_2 D_1 = 1$ and
$D \tilde{D} = D_2 D_1 \tilde{D}_1 \tilde{D}_2 = 1$. However, the resulting rotation depends
on the order, that is, in general $D_1 D_2 \neq D_2 D_1$, e.g., for finite rotations about
different axes.

For the Cartesian components of a vector $\mathbf{a}$, we have

\[
a_k = \mathbf{e}_k\cdot\mathbf{a}, \qquad a_i' = \mathbf{e}_i'\cdot\mathbf{a} = \sum_k D_{ik}\, a_k.
\]


Instead of going over to a rotated coordinate system, we may also stick with the reference
frame and rotate all objects. In both cases we change the Cartesian components of every
vector $\mathbf{a}$. However, the rotation of an object through an angle $\alpha$ corresponds
to the opposite rotation of the coordinate systems, through the angle $-\alpha$, and vice
versa. Therefore, with column matrices $A'$ and $A$ and with the rotation matrix $D$, we write

\[
A' = D A, \quad\text{or}\quad \mathbf{a}' = D\, \mathbf{a}.
\]

Fig. 1.10 The Euler angles $\alpha$, $\beta$, $\gamma$, used to describe the transition from
unprimed to primed coordinates. The dashed line is the line of nodes
$\mathbf{e}_z \times \mathbf{e}_{z'}$. The sequence is black → blue → green → red. The initial
equator is black and the last one red

Here, the second equation refers to a rotation of the vectors, because $\mathbf{a}$ and
$\mathbf{a}'$ should be fixed independently of the coordinate system. Correspondingly, we may
also write the scalar product $\mathbf{a}\cdot\mathbf{b}$ as a matrix product $\tilde{A} B$ of
a row and of a column vector, for which their Cartesian components are necessary. Then we
find $\widetilde{DA}\, DB = \tilde{A} \tilde{D} D B = \tilde{A} B$, implying that
$\mathbf{a}'\cdot\mathbf{b}' = \mathbf{a}\cdot\mathbf{b}$, as it should be for a scalar
product. (In the next section, we will obtain the scalar product for other coordinate systems.)

Because the matrix equation is symmetric ($\tilde{1} = 1$), the requirement $\tilde{D} D = 1$
constitutes six conditions in three dimensions, and $\frac{1}{2} N(N+1)$ conditions in $N$
dimensions. Consequently, orthogonal transformations in three dimensions depend upon three
real parameters. A rotation can be fixed uniquely by specifying these, e.g., by specifying
the (axial) rotation vector in the direction of the rotation axis, with value equal to the
rotation angle, or by specifying the three Euler angles $\alpha$, $\beta$, $\gamma$, with which
one goes over from the original frame $\{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\}$ to the
rotated one $\{\mathbf{e}_{x'}, \mathbf{e}_{y'}, \mathbf{e}_{z'}\}$ (see Fig. 1.10):


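• The first Euler angle $\alpha$ fixes the azimuth, i.e.,
$\{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\} \to \{\mathbf{e}_{\bar{x}}, \mathbf{e}_{\bar{y}}, \mathbf{e}_{\bar{z}}\}$
with $\mathbf{e}_{\bar{z}} = \mathbf{e}_z$, while the other axes move in a horizontal plane $P_1$.

• The second Euler angle $\beta$ describes the polar distance (motion of the z-direction),
i.e., $\{\mathbf{e}_{\bar{x}}, \mathbf{e}_{\bar{y}}, \mathbf{e}_{\bar{z}}\} \to \{\mathbf{e}_{\bar{\bar{x}}}, \mathbf{e}_{\bar{\bar{y}}}, \mathbf{e}_{\bar{\bar{z}}}\}$,
with $\mathbf{e}_{\bar{\bar{y}}} = \mathbf{e}_{\bar{y}}$. The new $\mathbf{e}_{\bar{\bar{x}}}$ and
$\mathbf{e}_{\bar{\bar{y}}}$ axes span a plane $P_2$ inclined at an angle $\beta$ to the
horizontal. The two planes $P_1$ and $P_2$ intersect along
$\mathbf{e}_{\bar{\bar{y}}} = \mathbf{e}_{\bar{y}}$.

• The third Euler angle $\gamma$ describes the rotation about the new $z'$ direction, that is,
$\{\mathbf{e}_{\bar{\bar{x}}}, \mathbf{e}_{\bar{\bar{y}}}, \mathbf{e}_{\bar{\bar{z}}}\} \to \{\mathbf{e}_{x'}, \mathbf{e}_{y'}, \mathbf{e}_{z'}\}$,
with $\mathbf{e}_{z'} = \mathbf{e}_{\bar{\bar{z}}}$, and the other axes moving on the plane
$P_2$. The common axis is along $\mathbf{e}_{\bar{y}} = \mathbf{e}_{\bar{\bar{y}}}$, the so-called
line of nodes.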




The first two Euler angles are called the azimuth and polar distance of the new z-axis 
in the old system, while the third Euler angle gives the angle between the new y-axis 
and the line of nodes. This line of nodes forms a right-handed system with the old 
and the new z-axes. 

In some cases the Euler angles are defined differently, namely with a left-handed
frame or the angles between the line of nodes and the x-axes instead of the y-axes,
but the simple assignment of $\alpha$ to the azimuth of the new z-axis is then lost.

We now have

\[
D = D_\alpha\, D_\beta\, D_\gamma
\]

with

\[
D_\alpha = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
D_\beta = \begin{pmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{pmatrix},
\]

and $D_\gamma$ like $D_\alpha$, but with $\gamma$ instead of $\alpha$, because $D_\alpha$ and
$D_\gamma$ describe rotations about the (old) z-axis, $D_\beta$ a rotation about the y-axis. If
it were the coordinate system that were rotated, then every sine would have the opposite
sign, because of the opposite rotation. Of course, starting from the Euler angles, we can
evaluate the rotation vector, and vice versa, but we shall not discuss that here. Further
properties are derived in Problems 2.1-2.3.
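
The product of the three Euler rotations can be checked with a few lines of plain Python.
This is only a sketch: the helper names (`rot_z`, `rot_y`, `matmul`, `euler_matrix`, and so
on) and the sample angles are assumptions made for illustration. It verifies that $D$ is
proper orthogonal, that the new z-axis has azimuth $\alpha$ and polar distance $\beta$, and
that finite rotations about different axes do not commute.

```python
import math

def rot_z(angle):
    """Rotation of objects about the z-axis (same form as D_alpha, D_gamma)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(angle):
    """Rotation of objects about the y-axis (same form as D_beta)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(3)) for k in range(3)]
            for i in range(3)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def euler_matrix(alpha, beta, gamma):
    """D = D_alpha D_beta D_gamma."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_z(gamma)))

alpha, beta, gamma = 0.3, 1.1, -0.7
d = euler_matrix(alpha, beta, gamma)

identity_check = matmul(transpose(d), d)     # should be the unit matrix
determinant = det3(d)                        # should be +1 (proper rotation)

# Image of e_z under D is the third column of D.
new_z = [row[2] for row in d]
expected_z = [math.cos(alpha) * math.sin(beta),
              math.sin(alpha) * math.sin(beta),
              math.cos(beta)]

# Order dependence: D_alpha D_beta differs from D_beta D_alpha.
ab = matmul(rot_z(alpha), rot_y(beta))
ba = matmul(rot_y(beta), rot_z(alpha))
order_gap = max(abs(ab[i][k] - ba[i][k]) for i in range(3) for k in range(3))
```

The third Euler angle does not influence `new_z`, since the last factor rotates about the
z-axis itself.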


1.2.2 General Coordinates and Their Base Vectors 


So far all quantities have been written in a coordinate-free manner as far as possible;
Cartesian coordinates and unit vectors have occasionally been useful only for conversions.
Sometimes curvilinear coordinates are more appropriate, e.g., spherical coordinates
$(r, \theta, \varphi)$ or cylindrical coordinates $(r, \varphi, z)$, where circles also appear
as coordinate lines. Still, for these two examples the coordinates are orthogonal to each
other everywhere. We are thus dealing here with curvilinear rectangular coordinates.
But we would like to allow also for oblique coordinates. These are convenient, e.g.,
for crystallography, and they also provide a suitable framework for relativity
theory. Curvilinear oblique coordinates are what restrict us the least.

Even though a three-dimensional space is assumed throughout the following, most 
of the discussion can be transferred easily to higher dimensions. We shall hint at the 
special features of three-dimensional space in the appropriate place, namely, for axial 
vectors. 

As usual, from now on we will write $(x^1, x^2, x^3) \equiv \{x^i\}$ for the coordinate
triple, despite the risk here of confusing the index $i$ with a power. In addition, instead of
the Cartesian unit vectors, we introduce two sorts of base vectors. In crystal physics,
$\mathbf{g}_i$ is called a lattice vector and $\mathbf{g}^i$ (except for a factor of $2\pi$) a
reciprocal lattice vector, but restricted to linear coordinates with constant base vectors:

\[
\text{covariant base vectors (g i down)} \quad \mathbf{g}_i \equiv \frac{\partial\mathbf{r}}{\partial x^i},
\]

or

\[
\text{contravariant base vectors (g i up)} \quad \mathbf{g}^i \equiv \nabla x^i.
\]

In these equations the index $i$ on the right-hand side is really a lower or upper index
(an upper index in the denominator of $\partial\mathbf{r}/\partial x^i$ counts as a lower index).

Fig. 1.11 Oblique coordinates are indicated here by lines with $\delta x^i = 1$. Shown are
their covariant base vectors $\mathbf{g}_i$ and also their contravariant base vectors
$\mathbf{g}^i$. If $\mathbf{g}_1$ and $\mathbf{g}_2$ form an angle $\gamma$ and if these
vectors have lengths $g_1$ and $g_2$, respectively, then the lengths of the contravariant base
vectors are $g^i = 1/(g_i \sin\gamma)$ (from $\mathbf{g}^i\cdot\mathbf{g}_k = \delta^i_k$).
Oblique coordinates appear, e.g., if for unequal masses two-body coordinates are transformed
to center-of-mass and relative coordinates (see Fig. 2.7)

The covariant base vector $\mathbf{g}_i$ is tangent to the coordinate line $x^i$ (all other
coordinates remain fixed), and the contravariant base vector $\mathbf{g}^i$ is perpendicular
to the surface $x^i = \text{const.}$ (all other coordinates may change) (see Fig. 1.11). For
rectangular coordinates, $\mathbf{g}_i$ and $\mathbf{g}^i$ have the same direction, but for
oblique ones, they do not. For rectangular coordinates the two base vectors generally have
different lengths. Only for Cartesian coordinates are covariant and contravariant base
vectors equal, viz., to the corresponding unit vectors (see Problems 3.10 to 3.12).

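The two scalar products

\[
g_{ik} \equiv \mathbf{g}_i\cdot\mathbf{g}_k = \frac{\partial\mathbf{r}}{\partial x^i}\cdot\frac{\partial\mathbf{r}}{\partial x^k} = g_{ki},
\]
\[
g^{ik} \equiv \mathbf{g}^i\cdot\mathbf{g}^k = \nabla x^i\cdot\nabla x^k = g^{ki},
\]

depend on the chosen coordinates (because all base vectors depend on them), but not
the scalar products of covariant and contravariant base vectors,

\[
\mathbf{g}_i\cdot\mathbf{g}^k = \frac{\partial\mathbf{r}}{\partial x^i}\cdot\nabla x^k = \frac{\partial x^k}{\partial x^i} = \delta_i^k \equiv
\begin{cases} 0 & \text{for } i \neq k, \\ 1 & \text{for } i = k. \end{cases}
\]

Covariant and contravariant base vectors each form an expansion basis. Therefore, also

\[
\mathbf{a} = \sum_i \mathbf{g}_i\, (\mathbf{g}^i\cdot\mathbf{a}) = \sum_i \mathbf{g}^i\, (\mathbf{g}_i\cdot\mathbf{a}),
\]

in particular, $\mathbf{g}^k = \sum_i \mathbf{g}_i\, g^{ik}$, $\mathbf{g}_k = \sum_i \mathbf{g}^i\, g_{ik}$, and

\[
\sum_k g_{ik}\, g^{kl} = \mathbf{g}_i\cdot\mathbf{g}^l = \delta_i^l.
\]

This very decisively generalizes the decomposition into Cartesian unit vectors, not
only to curvilinear, but also to oblique coordinates. With the useful concepts

\[
\text{covariant component of } \mathbf{a}: \quad a_i \equiv \mathbf{g}_i\cdot\mathbf{a}
\]
\[
\text{and contravariant component of } \mathbf{a}: \quad a^i \equiv \mathbf{g}^i\cdot\mathbf{a}
\]

and with $\mathbf{a} = \sum_i \mathbf{g}_i\, a^i = \sum_i \mathbf{g}^i\, a_i$, we thus obtain

\[
a_i = \sum_k g_{ik}\, a^k, \qquad a^i = \sum_k g^{ik}\, a_k, \qquad\text{and}\qquad \mathbf{a}\cdot\mathbf{b} = \sum_i a_i\, b^i.
\]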
k k i 


Covariant and contravariant components can be converted into each other, referred to 
as raising and lowering indices. With the scalar product, covariant and contravariant 
components always appear. We shall always meet sums of products where the index 
in the factors appears one up and one down. Therefore, we generally use Einstein’s 
summation convention, according to which, for these index positions, the summation 
symbol is left out. This is indeed what we shall do below (from Sect. 3.4.3 on). 
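
For constant oblique base vectors these relations are easy to verify numerically. The sketch
below is an illustration under stated assumptions: the three sample base vectors are
arbitrary, the helper names are made up, and the contravariant vectors are built with the
standard reciprocal-lattice construction
$\mathbf{g}^1 = \mathbf{g}_2\times\mathbf{g}_3 / (\mathbf{g}_1\cdot(\mathbf{g}_2\times\mathbf{g}_3))$
(and cyclic), which has not been derived in the text up to this point.

```python
# Sketch: covariant/contravariant base vectors and components for a constant
# oblique basis. Sample vectors and helper names are hypothetical.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def reciprocal_basis(g1, g2, g3):
    """Contravariant base vectors g^i with g^i . g_k = delta^i_k."""
    volume = dot(g1, cross(g2, g3))          # triple product of the basis
    return (tuple(c / volume for c in cross(g2, g3)),
            tuple(c / volume for c in cross(g3, g1)),
            tuple(c / volume for c in cross(g1, g2)))

g_lower = ((1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.0, 1.0, 3.0))   # g_i
g_upper = reciprocal_basis(*g_lower)                             # g^i
metric = [[dot(gi, gk) for gk in g_lower] for gi in g_lower]     # g_ik

a = (2.0, -1.0, 4.0)                       # Cartesian components of a
a_lower = [dot(g, a) for g in g_lower]     # a_i = g_i . a
a_upper = [dot(g, a) for g in g_upper]     # a^i = g^i . a

# Lowering the index with the metric: a_i = sum_k g_ik a^k.
lowered = [sum(metric[i][k] * a_upper[k] for k in range(3)) for i in range(3)]

# Scalar product from mixed components: a . a = sum_i a_i a^i.
norm_sq = sum(lo * up for lo, up in zip(a_lower, a_upper))
```

The value `norm_sq` agrees with the Cartesian result $2^2 + 1^2 + 4^2 = 21$, although neither
`a_lower` nor `a_upper` alone would give it.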


1.2.3 Coordinate Transformations 


New and old quantities are usually denoted with and without a prime, respectively. 
In view of various indices being added, a bar will be used instead of the prime in this 
book. 

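With a change of coordinates, the behavior depends decisively on the position
of the indices. Since
$\partial/\partial\bar{x}^i = \sum_k (\partial x^k/\partial\bar{x}^i)\, (\partial/\partial x^k)$,
on the one hand, and since we also have
$\bar{\mathbf{g}}^i\cdot d\mathbf{r} = d\bar{x}^i = \sum_k (\partial\bar{x}^i/\partial x^k)\, dx^k$,
with $dx^k = \mathbf{g}^k\cdot d\mathbf{r}$, on the other, the transition $x^i \to \bar{x}^i$ is
connected to the following equations, the order of factors being irrelevant. Here the
coefficients form a matrix, the row index being given by the numerator and the column index
by the denominator:

\[
\bar{\mathbf{g}}_i = \sum_k \frac{\partial x^k}{\partial\bar{x}^i}\, \mathbf{g}_k, \qquad
\bar{a}_i = \sum_k \frac{\partial x^k}{\partial\bar{x}^i}\, a_k,
\]
\[
\bar{\mathbf{g}}^i = \sum_k \frac{\partial\bar{x}^i}{\partial x^k}\, \mathbf{g}^k, \qquad
\bar{a}^i = \sum_k \frac{\partial\bar{x}^i}{\partial x^k}\, a^k.
\]

Here, $\bar{a}_i = \mathbf{a}\cdot\bar{\mathbf{g}}_i$ and $\bar{a}^i = \mathbf{a}\cdot\bar{\mathbf{g}}^i$.
With the change of coordinates, the base vectors change, but not the other vectors
$\mathbf{a}$. Covariant and contravariant quantities have transformation matrices inverse to
each other:

\[
\sum_k \frac{\partial\bar{x}^i}{\partial x^k}\, \frac{\partial x^k}{\partial\bar{x}^j} = \frac{\partial\bar{x}^i}{\partial\bar{x}^j} = \delta^i_j.
\]

The system of equations $d\bar{x}^i = \sum_k (\partial\bar{x}^i/\partial x^k)\, dx^k$ can be
written as a matrix equation:

\[
\begin{pmatrix} d\bar{x}^1 \\ d\bar{x}^2 \\ d\bar{x}^3 \end{pmatrix} =
\begin{pmatrix}
\partial\bar{x}^1/\partial x^1 & \partial\bar{x}^1/\partial x^2 & \partial\bar{x}^1/\partial x^3 \\
\partial\bar{x}^2/\partial x^1 & \partial\bar{x}^2/\partial x^2 & \partial\bar{x}^2/\partial x^3 \\
\partial\bar{x}^3/\partial x^1 & \partial\bar{x}^3/\partial x^2 & \partial\bar{x}^3/\partial x^3
\end{pmatrix}
\begin{pmatrix} dx^1 \\ dx^2 \\ dx^3 \end{pmatrix}.
\]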


The transformation matrix is called the Jacobi matrix or functional matrix. Naturally, 
it also exists for space dimensions other than three. 

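For two successive transformations, the two associated Jacobi matrices can be
combined in a single product matrix. If the second transformation is the transformation
back to the original coordinates, then the result is the unit matrix: the inverse
transformation is described by the inverse matrix. This exists only if the Jacobi
determinant (functional determinant), viz.,

\[
\frac{\partial(\bar{x}^1, \bar{x}^2, \bar{x}^3)}{\partial(x^1, x^2, x^3)} \equiv
\begin{vmatrix}
\partial\bar{x}^1/\partial x^1 & \partial\bar{x}^1/\partial x^2 & \partial\bar{x}^1/\partial x^3 \\
\partial\bar{x}^2/\partial x^1 & \partial\bar{x}^2/\partial x^2 & \partial\bar{x}^2/\partial x^3 \\
\partial\bar{x}^3/\partial x^1 & \partial\bar{x}^3/\partial x^2 & \partial\bar{x}^3/\partial x^3
\end{vmatrix},
\]

does not vanish, and likewise the determinant of the inverse Jacobi matrix, because
the two coordinate systems should be treated on an equal footing.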
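
A two-dimensional example makes the statement about inverse Jacobi matrices concrete. The
sketch below uses plane polar coordinates $(r, \varphi)$ versus Cartesian $(x, y)$; the
function names and the sample point are illustrative assumptions. The rows of each matrix
are labeled by the numerator and the columns by the denominator, as in the text.

```python
import math

def jacobian_polar_to_cartesian(r, phi):
    """Matrix d(x, y)/d(r, phi) for x = r cos(phi), y = r sin(phi)."""
    return [[math.cos(phi), -r * math.sin(phi)],
            [math.sin(phi),  r * math.cos(phi)]]

def jacobian_cartesian_to_polar(x, y):
    """Matrix d(r, phi)/d(x, y) for r = sqrt(x^2 + y^2), phi = atan2(y, x)."""
    r = math.hypot(x, y)
    return [[x / r, y / r],
            [-y / r ** 2, x / r ** 2]]

def matmul2(a, b):
    return [[sum(a[i][j] * b[j][k] for j in range(2)) for k in range(2)]
            for i in range(2)]

r, phi = 2.0, 0.6
x, y = r * math.cos(phi), r * math.sin(phi)

forward = jacobian_polar_to_cartesian(r, phi)
backward = jacobian_cartesian_to_polar(x, y)

# Transformation there and back again: the product is the unit matrix.
product = matmul2(backward, forward)

# Jacobi determinant d(x, y)/d(r, phi) = r; it vanishes only at the origin.
det_forward = forward[0][0] * forward[1][1] - forward[0][1] * forward[1][0]
```

The determinant equals `r`, so the inverse matrix exists everywhere except at the origin,
where polar coordinates are not unique.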


1.2.4 The Concept of a Tensor


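We generalize the expressions derived so far for a vector field and denote as a tensor
of rank $n + m$ (with $n$ covariant and $m$ contravariant indices) a quantity whose
components transform under a change of coordinates according to

\[
\bar{T}^{\,i_1 \ldots i_m}{}_{k_1 \ldots k_n} = \sum_{j_1 \ldots l_n}
\frac{\partial\bar{x}^{i_1}}{\partial x^{j_1}} \cdots \frac{\partial\bar{x}^{i_m}}{\partial x^{j_m}}\;
\frac{\partial x^{l_1}}{\partial\bar{x}^{k_1}} \cdots \frac{\partial x^{l_n}}{\partial\bar{x}^{k_n}}\;
T^{\,j_1 \ldots j_m}{}_{l_1 \ldots l_n}.
\]

Scalars are tensors of zeroth rank and vectors are tensors of first rank. If $T(x)$ is a
scalar field, then the new function $\bar{T}(\bar{x})$ should have the same value for the
coordinates $\bar{x}$ as the old function $T(x)$ for the old coordinates $x = f(\bar{x})$,
whence we should have $\bar{T}(\bar{x}) = T(f(\bar{x}))$ without further transformation
matrices. In contrast, for a gradient field with $\nabla T_i \equiv \nabla T\cdot\mathbf{g}_i$,
because $\mathbf{g}_i = \partial\mathbf{r}/\partial x^i$ and $\nabla T_i = \partial T/\partial x^i$,
we have

\[
\overline{\nabla T}_i = \frac{\partial\bar{T}(\bar{x})}{\partial\bar{x}^i}
 = \sum_k \frac{\partial T(x)}{\partial x^k}\, \frac{\partial x^k}{\partial\bar{x}^i}
 = \sum_k \nabla T_k\, \frac{\partial x^k}{\partial\bar{x}^i},
\]

showing that this is a vector field.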
Tensors of the same type can be added, and the (tensor) product of a tensor of $n$th
rank with a tensor of $m$th rank is a tensor of rank $n + m$:

\[
T^{\,i_1 \ldots i_n}\; T^{\,k_1 \ldots k_m} = T^{\,i_1 \ldots i_n k_1 \ldots k_m}.
\]

Of course, some covariant components may occur on the left- and right-hand sides.
But one can also lower the tensorial rank by contracting the tensor:

\[
\sum_i T^{\,i\, i_2 \ldots i_n}{}_{i\, k_2 \ldots k_m} = T^{\,i_2 \ldots i_n}{}_{k_2 \ldots k_m},
\]

because covariant and contravariant components transform inversely to each other.
(Here, too, the summation symbol is often left out, using the Einstein summation
convention.) A special case of this is the scalar product of two vectors,

\[
\sum_i a^i\, b_i = \mathbf{a}\cdot\mathbf{b} = \sum_i a_i\, b^i.
\]

Generally, a tensor of $n$th rank can be contracted with $n$ vectors to produce a scalar.
This fixes tensors in a coordinate-free way. In Sect. 2.2.10, for example, we shall
introduce the moment of inertia $I$, which is a tensor of second rank. The tensor
product $I\boldsymbol{\omega}$ delivers the vector $\mathbf{L}$ (angular momentum) and
$\frac{1}{2}\, \boldsymbol{\omega}\cdot\mathbf{L}$ a scalar (kinetic energy), where $I$ is
contracted twice with the vector $\boldsymbol{\omega}$.


The trace of a square matrix is the sum of its diagonal elements: $\sum_i I^i{}_i = \mathrm{tr}\, I$,
which is the contraction of a tensor of second rank to a scalar. In fact, $\mathrm{tr}\, I$
remains unchanged under a change of coordinates.

The change of coordinates under a rotation on p. 30 led to the matrix equation
$A' = DA$ for a column vector $A$. Correspondingly, $\mathbf{L} = I\boldsymbol{\omega}$ reads
$L = I\Omega$ as a matrix equation where $L$ and $\Omega$ are column matrices and $I$ is a
square matrix. For a rotation we have $L' = DL$, $\Omega' = D\Omega$, and
$\Omega = D^{-1}\Omega'$, respectively, and hence $L' = D I D^{-1} \Omega'$, so
$L' = I'\Omega'$ with $I' = D I D^{-1}$. Here we now write $L^i = \sum_k I^i{}_k\, \omega^k$ and

\[
\bar{I}^{\,i}{}_k = \sum_{jl} \frac{\partial\bar{x}^i}{\partial x^j}\, \frac{\partial x^l}{\partial\bar{x}^k}\; I^j{}_l,
\quad\text{with}\quad
\sum_j \frac{\partial\bar{x}^i}{\partial x^j}\, \frac{\partial x^j}{\partial\bar{x}^k} = \delta^i_k.
\]

The last equation corresponds to $D D^{-1} = 1$.
The quantities $g^{ik}$ and $g_{ik}$ introduced above are tensors of second rank. Since

\[
d\mathbf{r}\cdot d\mathbf{r} = \sum_{ik} \frac{\partial\mathbf{r}}{\partial x^i}\cdot\frac{\partial\mathbf{r}}{\partial x^k}\; dx^i\, dx^k
 = \sum_{ik} g_{ik}\; dx^i\, dx^k,
\]

we call $(g_{ik})$ the metric tensor. The matrices $(g_{ik})$ and $(g^{ik})$ are diagonal for
rectangular coordinates, but not for oblique coordinates. With Cartesian coordinates, they
are unit matrices.

The indices of all tensors can be raised or lowered using the tensors $g_{ik}$ and $g^{ik}$,
as we have seen already in Sect. 1.2.2 for vectors. Similarly,

\[
T^i{}_k = \sum_j g^{ij}\, T_{jk}, \qquad T^{ik} = \sum_{jl} g^{ij}\, g^{kl}\, T_{jl},
\]

and similarly, $T_{ik} = \sum_{jl} g_{ij}\, g_{kl}\, T^{jl}$.

If an equation holds in Cartesian coordinates and if it holds as a tensor equation,
then it holds also in general coordinates. If a tensor of second rank is symmetric or
antisymmetric, $T^{ik} = \pm T^{ki}$, then it has this property in every coordinate system.

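The (scalar) triple product of the three base vectors $\mathbf{g}_1, \mathbf{g}_2, \mathbf{g}_3$
is denoted by $\varepsilon_{123}$. Generally, we have

\[
\varepsilon_{ijk} \equiv \mathbf{g}_i\cdot(\mathbf{g}_j\times\mathbf{g}_k)
 = \frac{\partial\mathbf{r}}{\partial x^i}\cdot\Big( \frac{\partial\mathbf{r}}{\partial x^j}\times\frac{\partial\mathbf{r}}{\partial x^k} \Big)
 = \frac{\partial(x, y, z)}{\partial(x^i, x^j, x^k)}.
\]

This is the totally anti-symmetric (Levi-Civita) tensor of third rank. Under a change
of coordinates, $\varepsilon_{ijk}$ transforms like a tensor with three lower indices and
changes sign for the interchange of two indices. Therefore, we only need to evaluate
$\varepsilon_{123}$. This component can be traced back to the determinant of $(g_{ik})$
because, according to p. 5, we have

\[
\{\mathbf{g}_i\cdot(\mathbf{g}_j\times\mathbf{g}_k)\}^2 =
\begin{vmatrix}
\mathbf{g}_i\cdot\mathbf{g}_i & \mathbf{g}_i\cdot\mathbf{g}_j & \mathbf{g}_i\cdot\mathbf{g}_k \\
\mathbf{g}_j\cdot\mathbf{g}_i & \mathbf{g}_j\cdot\mathbf{g}_j & \mathbf{g}_j\cdot\mathbf{g}_k \\
\mathbf{g}_k\cdot\mathbf{g}_i & \mathbf{g}_k\cdot\mathbf{g}_j & \mathbf{g}_k\cdot\mathbf{g}_k
\end{vmatrix}.
\]

The (scalar) triple product of three real vectors is always real, and only zero if they are
coplanar (in which case the coordinates would be useless). Therefore, the determinant
is positive. We thus have

\[
\varepsilon_{123} = \pm\sqrt{g}, \quad\text{with}\quad g \equiv \det(g_{ik}) > 0,
\]

where the plus sign corresponds to a right-handed coordinate system and the minus
sign to a left-handed one. (In particular, for a "reflection at the origin", i.e., for
$x^i \to -x^i$ for all $i$, the sign of $\varepsilon_{123}$ switches.) In addition,

\[
\varepsilon^{ijk} \equiv \mathbf{g}^i\cdot(\mathbf{g}^j\times\mathbf{g}^k) = \frac{\partial(x^i, x^j, x^k)}{\partial(x, y, z)},
\]

and hence, according to p. 5,

\[
\varepsilon_{ijk}\, \varepsilon^{lmn} =
\begin{vmatrix}
\delta_i^l & \delta_i^m & \delta_i^n \\
\delta_j^l & \delta_j^m & \delta_j^n \\
\delta_k^l & \delta_k^m & \delta_k^n
\end{vmatrix}.
\]

We deduce that $\varepsilon_{123}\, \varepsilon^{123} = 1$, but also

\[
\sum_i \varepsilon_{ijk}\, \varepsilon^{imn} =
\begin{vmatrix} \delta_j^m & \delta_j^n \\ \delta_k^m & \delta_k^n \end{vmatrix}
\]

and

\[
\sum_{ij} \varepsilon_{ijk}\, \varepsilon^{ijn} = 2\, \delta_k^n.
\]

This equation is often useful.

The last paragraph is true only in three-dimensional space. Only there is the
vector product determined uniquely; otherwise the direction perpendicular to two
given directions is not determined. (But a totally antisymmetric tensor can also be
introduced for spaces of different dimensions via the functional determinant.)

Hence, in three dimensions we have

\[
\mathbf{g}_k\times\mathbf{g}_l = \sum_i \mathbf{g}^i\, \varepsilon_{ikl}
\quad\text{and}\quad
\mathbf{a}\times\mathbf{b} = \sum_{ikl} \mathbf{g}^i\, a^k\, b^l\, \varepsilon_{ikl}.
\]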


The volume element is the parallelepiped spanned by the line elements
$(\partial\mathbf{r}/\partial x^i)\, dx^i$,

\[
dV = |\mathbf{g}_1\cdot(\mathbf{g}_2\times\mathbf{g}_3)\; dx^1\, dx^2\, dx^3|
 = |\varepsilon_{123}\; dx^1\, dx^2\, dx^3|
 = \sqrt{g}\; |dx^1\, dx^2\, dx^3|
 = \Big| \frac{\partial(x, y, z)}{\partial(x^1, x^2, x^3)}\; dx^1\, dx^2\, dx^3 \Big|.
\]

In addition to $|dx^1\, dx^2\, dx^3|$, the functional determinant of the associated
coordinates appears.

The area element $d\mathbf{f}_{(1)}$ is related to the vector $\mathbf{g}^1$ which is
perpendicular to the area $x^1 = \text{const.}$ of the parallelepiped. Its scalar product with
the vector $\mathbf{g}_1\, dx^1$ results in $\varepsilon_{123}\; dx^1\, dx^2\, dx^3$. Hence, we
infer that

\[
d\mathbf{f}_{(1)} = \mathbf{g}_2\times\mathbf{g}_3\; dx^2\, dx^3 = \varepsilon_{123}\; \mathbf{g}^1\; dx^2\, dx^3,
\]

with the value $df_{(1)} = \sqrt{g\, g^{11}}\; |dx^2\, dx^3|$. As we shall soon see, these
expressions are useful for vector analysis and, by the way, also for relativity theory.
(Of course, cyclic permutation of the three numbers 1, 2, 3 is allowed in this paragraph.)
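
For spherical coordinates the familiar volume element $r^2 \sin\theta\; dr\, d\theta\, d\varphi$
follows directly from these formulas. The following sketch (with illustrative helper names
and an arbitrary sample point, both assumptions) builds the covariant base vectors from
$\mathbf{r} = (r\sin\theta\cos\varphi,\ r\sin\theta\sin\varphi,\ r\cos\theta)$ and compares the
triple product with the determinant of the metric.

```python
import math

def base_vectors(r, theta, phi):
    """Covariant base vectors g_i = dr/dx^i for (x^1, x^2, x^3) = (r, theta, phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    g_r = (st * cp, st * sp, ct)
    g_theta = (r * ct * cp, r * ct * sp, -r * st)
    g_phi = (-r * st * sp, r * st * cp, 0.0)
    return g_r, g_theta, g_phi

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

r, theta, phi = 1.5, 0.8, 2.0
g = base_vectors(r, theta, phi)

metric = [[dot(gi, gk) for gk in g] for gi in g]    # g_ik, diagonal here
eps_123 = dot(g[0], cross(g[1], g[2]))              # triple product
det_metric = det3(metric)                           # g = det(g_ik)
```

In this right-handed system `eps_123` equals $+\sqrt{g} = r^2 \sin\theta$, and the metric is
diagonal with entries $1$, $r^2$, $r^2\sin^2\theta$.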


1.2.5 Gradient, Divergence, and Rotation in General 
Coordinates 


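For general coordinates, we find the expressions

\[
\nabla\psi = \sum_i \mathbf{g}^i\, (\mathbf{g}_i\cdot\nabla\psi) = \sum_i \mathbf{g}^i\, \frac{\partial\psi}{\partial x^i},
\]
\[
\nabla\cdot\mathbf{a} = \lim_{V\to 0} \frac{1}{V} \oint_{(V)} d\mathbf{f}\cdot\mathbf{a}
 = \frac{1}{\sqrt{g}} \sum_i \frac{\partial}{\partial x^i} \big( \sqrt{g}\; \mathbf{g}^i\cdot\mathbf{a} \big),
\]
\[
\Delta\psi = \frac{1}{\sqrt{g}} \sum_{ik} \frac{\partial}{\partial x^i} \Big( \sqrt{g}\; g^{ik}\, \frac{\partial\psi}{\partial x^k} \Big).
\]

However, the corresponding surface integrals for gradient and rotation are not
useful here, because $d\mathbf{f}_{(i)}$ can change its direction. Therefore, for the still
missing curl density, we start from Stokes's theorem, viz.,
$\int d\mathbf{f}\cdot(\nabla\times\mathbf{a}) = \oint d\mathbf{r}\cdot\mathbf{a} = \oint \sum_i a_i\, dx^i$,
and hence infer the equation
$\sqrt{g}\; \mathbf{g}^1\cdot(\nabla\times\mathbf{a}) = \partial a_3/\partial x^2 - \partial a_2/\partial x^3$.
Since $\sqrt{g}\, \varepsilon^{123} = -\sqrt{g}\, \varepsilon^{132} = 1$, we may also write
$\sqrt{g}\, \sum_{kl} \varepsilon^{1kl}\, \partial a_l/\partial x^k$ for the right-hand side:

\[
\nabla\times\mathbf{a} = \sum_{ikl} \mathbf{g}_i\; \varepsilon^{ikl}\; \frac{\partial a_l}{\partial x^k}.
\]

Now we have all the quantities mentioned in the title (and the Laplace operator) in
general coordinates.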


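Fig. 1.12 Spherical coordinates $r, \theta, \varphi$ and their unit vectors, with
$\mathbf{g}_1 = \mathbf{e}_r = \mathbf{r}/r$ (red), $\mathbf{g}_2 = r\, \mathbf{e}_\theta$
(blue), and $\mathbf{g}_3 = r \sin\theta\; \mathbf{e}_\varphi$ (green). Here, the angles
$\varphi$ and $\theta$ correspond to the "meridian" and "latitude", respectively, in geodesy.
However, the polar distance $\theta$ is measured from the north pole (always positive), and
the "latitude" from the equator

For rectangular coordinates, much is simplified here. In particular, $(g_{ik})$ and
$(g^{ik})$ are diagonal, and $\mathbf{g}_i$ and $\mathbf{g}^i$ have the same direction
$\mathbf{e}_i$. Only their lengths are different:

\[
\mathbf{e}_i = \frac{1}{g_i}\, \mathbf{g}_i = g_i\, \mathbf{g}^i, \quad\text{with}\quad
g_i \equiv \sqrt{\mathbf{g}_i\cdot\mathbf{g}_i} = \sqrt{g_{ii}} \quad\text{and}\quad g_i > 0.
\]

Hence, $d\mathbf{r} = \sum_i \mathbf{e}_i\, g_i\, dx^i$ and $\sqrt{g} = g_1\, g_2\, g_3$, together
with $a_i = (\mathbf{a}\cdot\mathbf{e}_i)\, g_i$ and $a^i = (\mathbf{a}\cdot\mathbf{e}_i) / g_i$.
We thus obtain

\[
\nabla\psi = \sum_i \mathbf{e}_i\, \frac{1}{g_i}\, \frac{\partial\psi}{\partial x^i},
\]
\[
\nabla\cdot\mathbf{a} = \frac{1}{g_1 g_2 g_3} \sum_i \frac{\partial}{\partial x^i} \Big( \frac{g_1 g_2 g_3}{g_i}\; \mathbf{a}\cdot\mathbf{e}_i \Big),
\]
\[
\Delta\psi = \frac{1}{g_1 g_2 g_3} \sum_i \frac{\partial}{\partial x^i} \Big( \frac{g_1 g_2 g_3}{g_i^{\,2}}\; \frac{\partial\psi}{\partial x^i} \Big),
\]
\[
\mathbf{e}_1\cdot(\nabla\times\mathbf{a}) = \frac{1}{g_2 g_3} \Big( \frac{\partial (g_3\, \mathbf{a}\cdot\mathbf{e}_3)}{\partial x^2} - \frac{\partial (g_2\, \mathbf{a}\cdot\mathbf{e}_2)}{\partial x^3} \Big)
\quad\text{(and cyclic permutations)}.
\]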


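The most important examples are, on the one hand, spherical coordinates, for which
$d\mathbf{r} = \mathbf{e}_r\, dr + \mathbf{e}_\theta\, r\, d\theta + \mathbf{e}_\varphi\, r \sin\theta\; d\varphi$
(see Fig. 1.12):

\[
\nabla\psi = \mathbf{e}_r\, \frac{\partial\psi}{\partial r} + \mathbf{e}_\theta\, \frac{1}{r}\, \frac{\partial\psi}{\partial\theta}
 + \mathbf{e}_\varphi\, \frac{1}{r \sin\theta}\, \frac{\partial\psi}{\partial\varphi},
\]
\[
\nabla\cdot\mathbf{a} = \frac{1}{r^2}\, \frac{\partial (r^2 a_r)}{\partial r}
 + \frac{1}{r \sin\theta}\, \frac{\partial (\sin\theta\; a_\theta)}{\partial\theta}
 + \frac{1}{r \sin\theta}\, \frac{\partial a_\varphi}{\partial\varphi},
\]
\[
\Delta\psi = \frac{1}{r^2}\, \frac{\partial}{\partial r} \Big( r^2\, \frac{\partial\psi}{\partial r} \Big)
 + \frac{1}{r^2 \sin\theta}\, \frac{\partial}{\partial\theta} \Big( \sin\theta\, \frac{\partial\psi}{\partial\theta} \Big)
 + \frac{1}{r^2 \sin^2\theta}\, \frac{\partial^2\psi}{\partial\varphi^2},
\]
\[
\nabla\times\mathbf{a} = \mathbf{e}_r\, \frac{1}{r \sin\theta} \Big( \frac{\partial (\sin\theta\; a_\varphi)}{\partial\theta} - \frac{\partial a_\theta}{\partial\varphi} \Big)
 + \mathbf{e}_\theta\, \frac{1}{r} \Big( \frac{1}{\sin\theta}\, \frac{\partial a_r}{\partial\varphi} - \frac{\partial (r\, a_\varphi)}{\partial r} \Big)
 + \mathbf{e}_\varphi\, \frac{1}{r} \Big( \frac{\partial (r\, a_\theta)}{\partial r} - \frac{\partial a_r}{\partial\theta} \Big),
\]

and on the other hand, cylindrical coordinates, for which
$d\mathbf{r} = \mathbf{e}_R\, dR + \mathbf{e}_\varphi\, R\, d\varphi + \mathbf{e}_z\, dz$
(see Fig. 1.13):

\[
\nabla\psi = \mathbf{e}_R\, \frac{\partial\psi}{\partial R} + \mathbf{e}_\varphi\, \frac{1}{R}\, \frac{\partial\psi}{\partial\varphi}
 + \mathbf{e}_z\, \frac{\partial\psi}{\partial z},
\]
\[
\nabla\cdot\mathbf{a} = \frac{1}{R}\, \frac{\partial (R\, a_R)}{\partial R} + \frac{1}{R}\, \frac{\partial a_\varphi}{\partial\varphi} + \frac{\partial a_z}{\partial z},
\]
\[
\Delta\psi = \frac{1}{R}\, \frac{\partial}{\partial R} \Big( R\, \frac{\partial\psi}{\partial R} \Big)
 + \frac{1}{R^2}\, \frac{\partial^2\psi}{\partial\varphi^2} + \frac{\partial^2\psi}{\partial z^2},
\]
\[
\nabla\times\mathbf{a} = \mathbf{e}_R \Big( \frac{1}{R}\, \frac{\partial a_z}{\partial\varphi} - \frac{\partial a_\varphi}{\partial z} \Big)
 + \mathbf{e}_\varphi \Big( \frac{\partial a_R}{\partial z} - \frac{\partial a_z}{\partial R} \Big)
 + \mathbf{e}_z\, \frac{1}{R} \Big( \frac{\partial (R\, a_\varphi)}{\partial R} - \frac{\partial a_R}{\partial\varphi} \Big).
\]

Fig. 1.13 Cylindrical coordinates $R, \varphi, z$ and their unit vectors. Instead of the
Cartesian coordinates $x, y$, the polar coordinates $R$ and $\varphi$ appear, so
$\mathbf{g}_3 = \mathbf{e}_z$ (green), $\mathbf{g}_1 = \mathbf{e}_R = \mathbf{R}/R$ (red), and
$\mathbf{g}_2 = R\, \mathbf{e}_\varphi = \mathbf{e}_z\times\mathbf{R}$ (blue)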
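
The spherical form of the Laplace operator can be tested against a function whose Cartesian
Laplacian is known. The sketch below is only an illustration: the test function
$\psi = x^2 y + z^3$ (with $\Delta\psi = 2y + 6z$), the sample point, the step size, and all
names are assumptions, and the derivatives are approximated by nested central differences.

```python
import math

def psi_cart(x, y, z):
    """Arbitrary smooth test function; its exact Laplacian is 2*y + 6*z."""
    return x * x * y + z ** 3

def psi_sph(r, theta, phi):
    """The same function expressed in spherical coordinates."""
    return psi_cart(r * math.sin(theta) * math.cos(phi),
                    r * math.sin(theta) * math.sin(phi),
                    r * math.cos(theta))

def laplace_spherical(r, theta, phi, h=1e-3):
    """Finite-difference version of the spherical Laplace operator."""
    def radial_flux(rr):        # r^2 dpsi/dr
        return rr ** 2 * (psi_sph(rr + h, theta, phi)
                          - psi_sph(rr - h, theta, phi)) / (2 * h)

    def polar_flux(tt):         # sin(theta) dpsi/dtheta
        return math.sin(tt) * (psi_sph(r, tt + h, phi)
                               - psi_sph(r, tt - h, phi)) / (2 * h)

    term_r = (radial_flux(r + h) - radial_flux(r - h)) / (2 * h * r ** 2)
    term_theta = ((polar_flux(theta + h) - polar_flux(theta - h))
                  / (2 * h * r ** 2 * math.sin(theta)))
    term_phi = ((psi_sph(r, theta, phi + h) + psi_sph(r, theta, phi - h)
                 - 2 * psi_sph(r, theta, phi))
                / (h ** 2 * (r * math.sin(theta)) ** 2))
    return term_r + term_theta + term_phi

r, theta, phi = 1.3, 0.9, 0.4
numeric = laplace_spherical(r, theta, phi)

# Exact value 2*y + 6*z at the same point.
exact = 2 * r * math.sin(theta) * math.sin(phi) + 6 * r * math.cos(theta)
```

The two numbers agree up to the discretization error of the finite differences.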


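In many cases, the fields $\psi$ or $\mathbf{a}$ depend only on $r$ (isotropy) or $R$
(cylindrical symmetry), respectively. Then we need only ordinary derivatives in spherical or
cylindrical coordinates, instead of partial derivatives.

For rectilinear coordinates there are also simplifications with constant base vectors,
because then $g$ remains the same everywhere:

\[
\nabla\psi = \sum_i \mathbf{g}^i\, \frac{\partial\psi}{\partial x^i},
\]
\[
\nabla\cdot\mathbf{a} = \sum_i \frac{\partial a^i}{\partial x^i},
\]
\[
\Delta\psi = \sum_{ik} g^{ik}\, \frac{\partial^2\psi}{\partial x^i\, \partial x^k},
\]
\[
(\nabla\times\mathbf{a})^i = \sum_{kl} \varepsilon^{ikl}\, \frac{\partial a_l}{\partial x^k}
\quad\Longrightarrow\quad
(\nabla\times\mathbf{a})^1 = \frac{1}{\sqrt{g}} \Big( \frac{\partial a_3}{\partial x^2} - \frac{\partial a_2}{\partial x^3} \Big).
\]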


The next section should only be read by those who want to enter into more detail— 
it is not needed to understand the following. Section 1.2.7 will be important only for 
thermodynamics. 


1.2.6 Tensor Extension, Christoffel Symbols 


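In deriving a gradient field from a scalar field, the rank of a tensor increases by one.
This tensor extension through differentiation also arises for tensors of higher rank,
but in this case variable base vectors require additional terms. In particular, we have

\[
\frac{\partial\mathbf{g}_k}{\partial x^l} = \sum_i \mathbf{g}^i \Big( \mathbf{g}_i\cdot\frac{\partial\mathbf{g}_k}{\partial x^l} \Big) \equiv \sum_i \mathbf{g}^i\, \{kl, i\}
 = \sum_i \mathbf{g}_i \Big( \mathbf{g}^i\cdot\frac{\partial\mathbf{g}_k}{\partial x^l} \Big) \equiv \sum_i \mathbf{g}_i \begin{Bmatrix} i \\ k\,l \end{Bmatrix}
\]

with the Christoffel symbols of the first kind

\[
\{kl, i\} \equiv \mathbf{g}_i\cdot\frac{\partial\mathbf{g}_k}{\partial x^l}
 = \frac{\partial\mathbf{r}}{\partial x^i}\cdot\frac{\partial^2\mathbf{r}}{\partial x^k\, \partial x^l}
 = \{lk, i\}
 = \frac{1}{2} \Big( \frac{\partial g_{ik}}{\partial x^l} + \frac{\partial g_{il}}{\partial x^k} - \frac{\partial g_{kl}}{\partial x^i} \Big)
\]

and the Christoffel symbols of the second kind

\[
\begin{Bmatrix} i \\ k\,l \end{Bmatrix} \equiv \mathbf{g}^i\cdot\frac{\partial\mathbf{g}_k}{\partial x^l}
 = \nabla x^i\cdot\frac{\partial^2\mathbf{r}}{\partial x^k\, \partial x^l}
 = \begin{Bmatrix} i \\ l\,k \end{Bmatrix}
 = \sum_j g^{ij}\, \{kl, j\}.
\]

Despite the last equation, the new symbols are generally not tensors of third rank,
because they contain second derivatives. Therefore, we shall avoid the notations
$\Gamma_{kli}$ for $\{kl, i\}$ and $\Gamma^i_{kl}$ for
$\begin{Bmatrix} i \\ k\,l \end{Bmatrix}$.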

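From these equations, it follows immediately that

\[
\frac{1}{2}\, \frac{\partial g_{kk}}{\partial x^l} = \mathbf{g}_k\cdot\frac{\partial\mathbf{g}_k}{\partial x^l} = \{kl, k\}.
\]

For rectangular coordinates, all $g_{ik}$ with $i \neq k$ vanish. Then
$\frac{1}{2}\, \partial g_{kk}/\partial x^l = g_{kk} \begin{Bmatrix} k \\ k\,l \end{Bmatrix}$
holds, and in addition for $k \neq l$, this is equal to
$-\{kk, l\} = -g_{ll} \begin{Bmatrix} l \\ k\,k \end{Bmatrix}$, because $g_{kl} = 0$. Since
$\mathbf{g}_i\cdot\mathbf{g}^k$ is constant, we have finally

\[
\frac{\partial\mathbf{g}^k}{\partial x^l} = \sum_i \mathbf{g}^i \Big( \mathbf{g}_i\cdot\frac{\partial\mathbf{g}^k}{\partial x^l} \Big)
 = -\sum_i \mathbf{g}^i \Big( \mathbf{g}^k\cdot\frac{\partial\mathbf{g}_i}{\partial x^l} \Big)
 = -\sum_i \mathbf{g}^i \begin{Bmatrix} k \\ i\,l \end{Bmatrix}.
\]

For the derivatives of the vector field, we have

\[
\frac{\partial\mathbf{a}}{\partial x^l} = \sum_k \mathbf{g}^k \Big( \mathbf{g}_k\cdot\frac{\partial\mathbf{a}}{\partial x^l} \Big)
 = \sum_k \mathbf{g}_k \Big( \mathbf{g}^k\cdot\frac{\partial\mathbf{a}}{\partial x^l} \Big).
\]

These coefficients are referred to as covariant derivatives:

\[
a_{k;l} \equiv \mathbf{g}_k\cdot\frac{\partial\mathbf{a}}{\partial x^l}
 = \frac{\partial a_k}{\partial x^l} - \mathbf{a}\cdot\frac{\partial\mathbf{g}_k}{\partial x^l}
 = \frac{\partial a_k}{\partial x^l} - \sum_i \begin{Bmatrix} i \\ k\,l \end{Bmatrix} a_i,
\]
\[
a^k{}_{;l} \equiv \mathbf{g}^k\cdot\frac{\partial\mathbf{a}}{\partial x^l}
 = \frac{\partial a^k}{\partial x^l} - \mathbf{a}\cdot\frac{\partial\mathbf{g}^k}{\partial x^l}
 = \frac{\partial a^k}{\partial x^l} + \sum_i \begin{Bmatrix} k \\ i\,l \end{Bmatrix} a^i.
\]

They are clearly tensors of second rank, obtained by differentiation from tensors of
first rank.

These observations can be applied to the velocity and acceleration. Since

\[
\frac{d\mathbf{g}_i}{dt} = \sum_j \frac{\partial\mathbf{g}_i}{\partial x^j}\, \frac{dx^j}{dt}
 = \sum_{jk} \mathbf{g}_k \begin{Bmatrix} k \\ i\,j \end{Bmatrix} \frac{dx^j}{dt},
\]

we obtain

\[
\mathbf{v} \equiv \frac{d\mathbf{r}}{dt} = \sum_k \frac{\partial\mathbf{r}}{\partial x^k}\, \frac{dx^k}{dt} = \sum_k \mathbf{g}_k\, \dot{x}^k
\quad\Longrightarrow\quad v^k = \dot{x}^k,
\]

and since $\mathbf{a} = \dot{\mathbf{v}} = \sum_k (\mathbf{g}_k\, \ddot{x}^k + \dot{\mathbf{g}}_k\, \dot{x}^k)$, we find

\[
a^k = \frac{d^2 x^k}{dt^2} + \sum_{ij} \begin{Bmatrix} k \\ i\,j \end{Bmatrix} \frac{dx^i}{dt}\, \frac{dx^j}{dt}.
\]

For motion along the coordinate line $x^k$, the first term $\ddot{x}^k$ here describes the
tangential accelerations and the rest the normal accelerations. This decomposition was
already explained on p. 7.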
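
For plane polar coordinates the Christoffel symbols and the resulting acceleration
components can be evaluated with a few lines of code. The sketch below is illustrative only:
it assumes the two-dimensional metric $g_{rr} = 1$, $g_{\varphi\varphi} = r^2$, uses central
differences for the derivatives of the metric, and all function names are hypothetical.

```python
def metric(r, phi):
    """g_ik for plane polar coordinates (x^1, x^2) = (r, phi)."""
    return [[1.0, 0.0], [0.0, r * r]]

def d_metric(x, l, h=1e-5):
    """Central-difference estimate of d g_ik / d x^l at the point x."""
    xp, xm = list(x), list(x)
    xp[l] += h
    xm[l] -= h
    gp, gm = metric(*xp), metric(*xm)
    return [[(gp[i][k] - gm[i][k]) / (2 * h) for k in range(2)] for i in range(2)]

def christoffel_first(x):
    """{kl, i} = (d_l g_ik + d_k g_il - d_i g_kl) / 2, indexed as [k][l][i]."""
    dg = [d_metric(x, l) for l in range(2)]        # dg[l][i][k] = d g_ik / d x^l
    return [[[0.5 * (dg[l][i][k] + dg[k][i][l] - dg[i][k][l])
              for i in range(2)] for l in range(2)] for k in range(2)]

def christoffel_second(x):
    """{i; kl} = sum_j g^ij {kl, j}, indexed as [i][k][l] (diagonal metric assumed)."""
    g = metric(*x)
    first = christoffel_first(x)
    return [[[first[k][l][i] / g[i][i] for l in range(2)]
             for k in range(2)] for i in range(2)]

def acceleration(x, velocity, second_derivative):
    """a^k = d2x^k/dt2 + sum_ij {k; ij} dx^i/dt dx^j/dt."""
    gamma = christoffel_second(x)
    return [second_derivative[k]
            + sum(gamma[k][i][j] * velocity[i] * velocity[j]
                  for i in range(2) for j in range(2))
            for k in range(2)]

point = (2.0, 0.5)                                  # (r, phi)
gamma = christoffel_second(point)

# Uniform circular motion: r fixed, d(phi)/dt = 3, no second derivatives.
circular = acceleration(point, velocity=(0.0, 3.0), second_derivative=(0.0, 0.0))
```

The only non-vanishing symbols are $\{r; \varphi\varphi\} = -r$ and
$\{\varphi; r\varphi\} = \{\varphi; \varphi r\} = 1/r$, and for uniform circular motion the
radial component $a^r = -r\, \dot{\varphi}^2$ is the centripetal (normal) acceleration.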




1.2.7 Reformulation of Partial Differential Quotients 


In the analysis of functions of multiple variables, partial derivatives appear. Here we
restrict to two variables for reasons of simplicity, but generalization is straightforward.
The main interest here is in the transformation to new variables (coordinates).
For a function $f$ of the two variables $x$ and $y$, we have

\[
df(x, y) = \frac{\partial f(x, y)}{\partial x}\, dx + \frac{\partial f(x, y)}{\partial y}\, dy.
\]

It is common to leave out the arguments of $f$ and instead attach the fixed parameter
to the differential quotient as a lower index. Hence the equation appears in the form

\[
df = \Big( \frac{\partial f}{\partial x} \Big)_y dx + \Big( \frac{\partial f}{\partial y} \Big)_x dy,
\quad\text{with}\quad \Big( \frac{\partial y}{\partial x} \Big)_y = 0 \quad\text{and}\quad \Big( \frac{\partial x}{\partial x} \Big)_y = 1.
\]


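From here various relations can be derived.
If we divide by $df$ and form the limit $df \to 0$ with constant $y$ or $x$, respectively,
i.e., for $dy = 0$ or $dx = 0$, respectively, then

\[
1 = \Big( \frac{\partial f}{\partial x} \Big)_y \Big( \frac{\partial x}{\partial f} \Big)_y
 = \Big( \frac{\partial f}{\partial y} \Big)_x \Big( \frac{\partial y}{\partial f} \Big)_x.
\]

The derivative of a function is thus equal to the reciprocal of the derivative of
the inverse function, as suggested by the notation (due to Leibniz). On the other
hand, if we divide by $dy$ and form the limit $dy \to 0$ for fixed $f$, we have
$0 = (\partial f/\partial x)_y\, (\partial x/\partial y)_f + (\partial f/\partial y)_x$, whence
the noteworthy equation

\[
\Big( \frac{\partial f}{\partial y} \Big)_x = -\Big( \frac{\partial f}{\partial x} \Big)_y \Big( \frac{\partial x}{\partial y} \Big)_f.
\]

We thus see that the fixed and the changed variable can be exchanged. This equation
may also be written in the form
$(\partial f/\partial x)_y\, (\partial x/\partial y)_f\, (\partial y/\partial f)_x = -1$ if we
consider the reciprocal of $(\partial f/\partial y)_x$.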
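
The cyclic relation with the product $-1$ is easy to confirm numerically. The following
sketch uses the arbitrary example $f(x, y) = x\, y^2$ (an assumption chosen because it can be
solved explicitly for $x$ and for $y$); the helper names and the sample point are likewise
hypothetical, and all derivatives are central differences.

```python
import math

def f(x, y):
    """Example function f(x, y) = x * y**2 (for x, y > 0)."""
    return x * y ** 2

def x_of(f_value, y):
    """f(x, y) solved for x at given f and y."""
    return f_value / y ** 2

def y_of(f_value, x):
    """f(x, y) solved for y at given f and x."""
    return math.sqrt(f_value / x)

def derivative(func, t, h=1e-6):
    """Central-difference estimate of d func / d t."""
    return (func(t + h) - func(t - h)) / (2 * h)

x0, y0 = 1.5, 0.8
f0 = f(x0, y0)

df_dx_y = derivative(lambda x: f(x, y0), x0)        # (df/dx) at fixed y
dx_dy_f = derivative(lambda y: x_of(f0, y), y0)     # (dx/dy) at fixed f
dy_df_x = derivative(lambda fv: y_of(fv, x0), f0)   # (dy/df) at fixed x
dx_df_y = derivative(lambda fv: x_of(fv, y0), f0)   # (dx/df) at fixed y

triple_product = df_dx_y * dx_dy_f * dy_df_x        # expected: -1
reciprocal_check = df_dx_y * dx_df_y                # expected: +1
```

The minus sign in `triple_product` is the point of the exercise: naive "cancelling" of the
differentials would suggest $+1$.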

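If we replace a variable with a new one, e.g., $y$ with $g(x, y)$, then from
$df = (\partial f/\partial x)_g\, dx + (\partial f/\partial g)_x\, \{(\partial g/\partial x)_y\, dx + (\partial g/\partial y)_x\, dy\}$,
we may deduce the two important equations

\[
\Big( \frac{\partial f}{\partial x} \Big)_y = \Big( \frac{\partial f}{\partial x} \Big)_g + \Big( \frac{\partial f}{\partial g} \Big)_x \Big( \frac{\partial g}{\partial x} \Big)_y
\]

and

\[
\Big( \frac{\partial f}{\partial y} \Big)_x = \Big( \frac{\partial f}{\partial g} \Big)_x \Big( \frac{\partial g}{\partial y} \Big)_x.
\]

According to the first equation, the fixed variable can be changed. The second
corresponds to the chain rule for ordinary derivatives.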

In the last product, if we swap the fixed and adjustable pair of variables and then 
apply the chain rule twice, it follows that 


Tnn 


Here, the pair (f, g) is exchanged with the pair (x, y). By the way, 


0 ð 0 0 
Graig 
dg/x\df/y dg/x\dy/ f 
For the proof, we can trace (0f/dg), back to (df/dg), and then exploit the equations 
above. 


If in (0f/dg), we use the chain rule with the variable y and then in (dy/dg) 
exchange the fixed and the adjustable variable, we also have 


aaa 


This corresponds to the replacement y <> g. In addition, 


Cease 


This can be understood by replacing x <> g in the first factor. 


1.3 Measurements and Errors 


1.3.1 Introduction 


The search for laws prepares the ground on which the principles of nature are built. 
We generalize by relating comparable things. Of course, this has its limitations. When 
are two things equal to each other, and when are they only similar? The following is 
important for all measurements, but also for quantum theory and for thermodynamics 
and statistics. 

We consider an arbitrary physical quantity which we assume does not change with 
time and can be measured repeatedly, e.g., the length of a rod or the oscillation period 
of a pendulum. Each measurement is carried out in terms of a “multiple of a scale 
unit”. It may be that a tenth of the unit can be estimated, but certainly not essentially 
finer divisions. An uncertainty is therefore attached to each of our measured values, 
and this uncertainty can be estimated rather simply. 


Fig. 1.14 Frequency distribution of the measurement series $\{x_n\}$ mentioned in the text.
The more often the same value is measured, the higher the associated column (blue). The
adjusted red bell-shaped curve is symmetrical with respect to the mean value
($\bar{x} = 10.183$), and the half-width ($\Delta x = 0.058$) corresponds to the
"measurement error"


It is more difficult to find a statement about how well an instrument is adjusted 
and whether there are further systematic errors. We will not deal with these questions 
here, but we do want to be able to estimate the bounds on the error from the statistical 
fluctuations of our measured data. 

In particular, if we repeat our measurement in order to ensure against erroneous
readings, then the values $x_n$ ($n \in \{1, \dots, N\}$) may not all be equal, e.g., we may
find three times 10.1 scale units (that is, three values with $10.05 < x_n < 10.15$), eight
times 10.2 (eight values with $10.15 < x_n < 10.25$), and one 10.3 (with $10.25 < x_n < 10.35$)
in an arbitrary order. Apparently, there are always “measurement errors”, the
origin of which we do not know. (Systematic errors can be estimated separately.)
Therefore, we have to assign a greater uncertainty than the assumed scale fineness
to the results of our measurements.

Hence, from the $N$ readings $\{x_n\}$ of our measurement, we would like to determine
a measurement result with error estimate in the form $\bar{x} \pm \Delta x$. For the example
mentioned, the result is $10.183 \pm 0.058$, as will be shown shortly, often abbreviated
to 10.183(58). This example is shown in Fig. 1.14. The error estimate here presents
only a frame for the actual error: improved measurement readings may also lie
outside the error limits given previously. If we compare, e.g., the error analysis for
the fundamental constants of the year 1999 (see p. 623) with the ones from 1986, we
obtain Table 1.3. Only the value for the Boltzmann constant $k$ has remained within
the old error limits. The Avogadro constant ($N_A$) and Planck constant ($h$) came to lie
outside the old limits, as did the value for the elementary charge $e$. The error limits
for the gravitational constant $G$ even went up by more than two orders.

It is pointless to give the error to more than the two leading digits, or the mean
value more exactly than the error. This is forgotten by many laypeople when they
communicate their computational result “exactly”, with far too many digits.




Table 1.3 Improvement of precision with time

| Quantity | Relative uncertainty 1986 (in $10^{-7}$) | Relative change 1999/1986 (in $10^{-7}$) | Relative uncertainty 1999 (in $10^{-7}$) |
|---|---|---|---|
| $e$ | ±3 | −5.6 | ±0.4 |
| $h$ | ±6 | −10.3 | ±0.8 |
| $N_A$ | ±6 | +8.6 | ±0.8 |
| $k$ | ±87 | −56 | ±17 |


1.3.2 Mean Value and Average Error 


After $N$ measurements of $x$, we have a sequence of measured values $\{x_1, \dots, x_N\}$.
These values are generally not all equal, but we want to assume that their fluctuations
are purely random, and we shall only deal with such errors in the following.

Since none of the measurement readings should be preferred, the true value $x_0$ is
assumed to be near the mean value

$$\bar{x} \equiv \frac{1}{N} \sum_{n=1}^{N} x_n ,$$

because deviations may occur equally often to higher or lower values: $x_0 \approx \bar{x}$. Our
best estimate for the true value $x_0$ is the mean value $\bar{x}$.

Here, the less the values $x_n$ deviate from $\bar{x}$, the more we trust the approximation
$x_0 \approx \bar{x}$. From the fluctuations, we deduce a measure $\Delta x$ for the uncertainty in our
estimate. To do this, we take the squares $(x_n - \bar{x})^2$ of the deviations rather than their
absolute values $|x_n - \bar{x}|$, because the squares are differentiable, while the absolute
values are not, something we shall exploit in Sect. 1.3.7. However, we may take their
mean value

$$\overline{(x - \bar{x})^2} = \overline{x^2} - 2\,\bar{x}\,\bar{x} + \bar{x}^2 = \overline{x^2} - \bar{x}^2$$

as a measure for the uncertainty only in the limit of many measurements, not just a
small number of measurements. So, for a single measurement nothing whatsoever
can be said about the fluctuations. For a second measurement, we would have only
a first clue about the fluctuations. In fact, we shall set

$$(\Delta x)^2 \equiv \frac{1}{N-1} \sum_{n=1}^{N} (x_n - \bar{x})^2 = \frac{N}{N-1}\,\overline{(x - \bar{x})^2} ,$$


as will be justified in the following sections. Here we shall rely on a simple special 
case of the law of error propagation. But this law can also be proven rather easily in its 
general form and will be needed for other purposes. Therefore, we prove it generally 
now, whereupon the last equation can be derived easily. To this end, however, we 
have to consider general properties of error distributions. 
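The definitions of the mean value and of the average error of a single measurement can be checked on the example series from Sect. 1.3.1. The following is a minimal sketch in Python; the function name `mean_and_single_error` is an illustrative choice, not something from the text.

```python
import math

def mean_and_single_error(readings):
    """Return (mean value, Delta x), using the N-1 denominator from the text."""
    n = len(readings)
    if n < 2:
        raise ValueError("at least two readings are needed to estimate the scatter")
    mean = sum(readings) / n
    squared_deviations = sum((x - mean) ** 2 for x in readings)
    return mean, math.sqrt(squared_deviations / (n - 1))

# Example series: three times 10.1, eight times 10.2, once 10.3 scale units
readings = [10.1] * 3 + [10.2] * 8 + [10.3]
mean, delta_x = mean_and_single_error(readings)
print(f"{mean:.3f} +/- {delta_x:.3f}")  # prints 10.183 +/- 0.058
```

The result reproduces the value 10.183(58) quoted for Fig. 1.14.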




1.3.3 Error Distribution 


We presume that the errors are distributed in a purely random manner. Then the
error probability can be derived from sufficiently many readings of the measurement
($N \gg 1$). From the relative occurrences of the single values, we can determine
the probability $\rho(\varepsilon)\,\mathrm{d}\varepsilon$ that the error lies between $\varepsilon$ and $\varepsilon + \mathrm{d}\varepsilon$. The probability
density $\rho(\varepsilon)$ is characterized essentially by the average error $\sigma$, as the following
considerations show.

Each probability distribution $\rho$ has to be normalized to unity and may not take negative
values: $\int \rho(\varepsilon)\,\mathrm{d}\varepsilon = 1$ and $\rho(\varepsilon) \ge 0$ for all $\varepsilon$ ($\in \mathbb{R}$). In addition, we expect $\rho(\varepsilon)$
to be essentially different from zero only for $\varepsilon \approx 0$ and to tend to zero monotonically
with increasing $|\varepsilon|$. The distribution is also assumed to be an even function, at least in
the important region around the zero point: $\rho(\varepsilon) = \rho(-\varepsilon)$. Hence, $\int \varepsilon\,\rho(\varepsilon)\,\mathrm{d}\varepsilon = 0$.
The next important feature is the width of the distribution. It can be measured with
the second moment, the average $\sigma^2$ of the squared errors (with $\sigma > 0$), also called the mean
square fluctuation or variance,

$$\sigma^2 \equiv \int \varepsilon^2\,\rho(\varepsilon)\,\mathrm{d}\varepsilon .$$


Note, however, that the mean square error is not finite for all allowable error
distributions, e.g., for the Lorentz distribution $\rho(\varepsilon) = \gamma/\{\pi\,(\varepsilon^2 + \gamma^2)\}$, which is instead
characterized by half the Lorentz half-width, $\gamma$ (more on that in the discussion around
Fig. 5.6).

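From the probability distribution $\rho(\varepsilon)$, we can evaluate the expectation value
$\langle f \rangle$ of any function $f(\varepsilon)$. Each value of the function is summed with its associated
weight:

$$\langle f \rangle \equiv \int f(\varepsilon)\,\rho(\varepsilon)\,\mathrm{d}\varepsilon .$$

In particular, $\langle \varepsilon^n \rangle = \int \varepsilon^n\,\rho(\varepsilon)\,\mathrm{d}\varepsilon$.

For the error distribution in the following we rely only on the properties $\langle \varepsilon^0 \rangle = 1$,
$\langle \varepsilon^1 \rangle = 0$, and $\langle \varepsilon^2 \rangle = \sigma^2$, among which only the middle one might be disputed: the
first is obvious by normalization, and the last fixes the average error $\sigma$.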

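If, however, we want to write down the probability $W(\lambda)$ for an error $|\varepsilon| \le \lambda\sigma$,
i.e.,

$$W(\lambda) \equiv \int_{-\lambda\sigma}^{\lambda\sigma} \rho(\varepsilon)\,\mathrm{d}\varepsilon ,$$

we have to know $\rho(\varepsilon)$ in more detail. Detailed statistical investigations suggest the
normal or Gaussian distribution (this will become apparent in Sect. 6.1.4):

$$\rho(\varepsilon) = \frac{\exp\left(-\tfrac{1}{2}\,\varepsilon^2/\sigma^2\right)}{\sqrt{2\pi}\,\sigma} .$$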




Fig. 1.15 Normal distribution of the error. Gauss function (bell-shaped curve). In order for all
average errors $\sigma$ (> 0) to result in the same curve, the probability $\rho$ for the error $\varepsilon$ times the average
error is shown here as a function of the ratio $\varepsilon/\sigma$. The area is unity for all $\sigma$




Fig. 1.16 Error integral $W(\lambda)$ (blue): the probability of errors with $|\varepsilon| \le \lambda\sigma$ ($\sigma$ is the average
error). The dashed red curve is the function $\tanh(\sqrt{2/\pi}\,\lambda)$ for comparison


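Figure 1.15 shows this function and Fig. 1.16 the associated error integral $W(\lambda)$.
The error integral is related to the error function

$$\operatorname{erf} x \equiv \frac{2}{\sqrt{\pi}} \int_0^x \exp(-y^2)\,\mathrm{d}y = W(\sqrt{2}\,x) ,$$

for which the following expansions are useful:

$$\operatorname{erf} x = \frac{2}{\sqrt{\pi}} \sum_{n=0}^{\infty} \frac{(-)^n}{n!}\,\frac{x^{2n+1}}{2n+1} ,$$

$$\operatorname{erf} x = 1 - \frac{\exp(-x^2)}{\sqrt{\pi}\,x} \left( 1 + \sum_{n=1}^{\infty} \frac{(-)^n\,(2n-1)!!}{(2x^2)^n} \right) \quad \text{for } x \gg 1 .$$

The second series is semi-convergent, i.e., it does not converge for $n \to \infty$, but
approximates the function sufficiently well for finite $n$ ($< x$).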


From Fig. 1.16, we see that, for the normal distribution, slightly more than 2/3 of
all values have an error $|\varepsilon| < \sigma$ and barely 5% an error $|\varepsilon| > 2\sigma$.
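These two fractions follow from the relation $W(\lambda) = \operatorname{erf}(\lambda/\sqrt{2})$ for the normal distribution. A minimal sketch using the error function from Python's standard library (the function name `error_integral` is an illustrative choice):

```python
import math

def error_integral(lam):
    """W(lambda): probability that a normally distributed error has |eps| <= lambda*sigma."""
    return math.erf(lam / math.sqrt(2.0))

for lam in (1, 2, 3):
    inside = error_integral(lam)
    print(f"lambda = {lam}: W = {inside:.4f}, outside = {1.0 - inside:.4f}")
```

For $\lambda = 1$ this gives about 0.683 (slightly more than 2/3), and for $\lambda = 2$ about 0.046 of all values lie outside.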


1.3.4 Error Propagation 


We now start from $K$ physical quantities $x_k$ with average errors $\sigma_k$ and consider the
derived quantity $y = f(x_1, \dots, x_K)$. Here all the quantities $x_k$ will be independent
of each other. What is then the average error in $y$?
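To begin with, the error $\varepsilon$ in $f(x_1, \dots, x_K)$ is to first order

$$\varepsilon = \sum_{k=1}^{K} \frac{\partial f}{\partial x_k}\,\varepsilon_k ,$$

and hence

$$\begin{aligned}
\sigma^2 = \langle \varepsilon^2 \rangle
&= \int\!\cdots\!\int_{-\infty}^{\infty} \Big( \sum_{k=1}^{K} \frac{\partial f}{\partial x_k}\,\varepsilon_k \Big)^{2} \rho(\varepsilon_1, \dots, \varepsilon_K)\,\mathrm{d}\varepsilon_1 \cdots \mathrm{d}\varepsilon_K \\
&= \Big\langle \sum_{k=1}^{K} \frac{\partial f}{\partial x_k}\,\varepsilon_k \sum_{l=1}^{K} \frac{\partial f}{\partial x_l}\,\varepsilon_l \Big\rangle
 = \sum_{k=1}^{K} \sum_{l=1}^{K} \frac{\partial f}{\partial x_k}\,\frac{\partial f}{\partial x_l}\,\langle \varepsilon_k\,\varepsilon_l \rangle .
\end{aligned}$$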


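Since the quantities $x_k$ and $x_l$ should not depend upon each other, they are not
correlated to each other (the property $x_l$ does not care how large $x_k$ is; correlations
will be investigated in more detail in Sect. 6.1.5). With $\langle \varepsilon_k^2 \rangle = \sigma_k^2$ this leads to

$$\langle \varepsilon_k\,\varepsilon_l \rangle = \begin{cases} \langle \varepsilon_k \rangle \langle \varepsilon_l \rangle & \text{for } k \neq l , \\ \sigma_k^2 & \text{for } k = l . \end{cases}$$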


Here, $\langle \varepsilon_k \rangle = 0$ holds for all $k$ (and $l$). Therefore, the law of error propagation follows:

$$\sigma^2 = \sum_{k=1}^{K} \Big( \frac{\partial f}{\partial x_k} \Big)^{2} \sigma_k^2 .$$

In the proof, no normally distributed errors were necessary. Thus other distributions
with the properties $\langle \varepsilon_k^0 \rangle = 1$, $\langle \varepsilon_k^1 \rangle = 0$, and $\langle \varepsilon_k^2 \rangle = \sigma_k^2$ also deliver the error
propagation law and with it the basis for all further proofs in this section. In particular,
we may invoke this law for repeated measurements of the same quantity, as we shall
now do.
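The law of error propagation can be applied numerically when the partial derivatives are not at hand. The following is a minimal sketch, assuming independent errors as in the text; the helper `propagate_error` and the rectangle example are illustrative assumptions, and the derivatives are approximated by central differences.

```python
import math

def propagate_error(f, values, sigmas, step=1e-6):
    """First-order error propagation: sigma^2 = sum_k (df/dx_k)^2 * sigma_k^2."""
    variance = 0.0
    for k, sigma_k in enumerate(sigmas):
        h = step * max(1.0, abs(values[k]))
        up = list(values)
        down = list(values)
        up[k] += h
        down[k] -= h
        derivative = (f(*up) - f(*down)) / (2.0 * h)  # central difference for df/dx_k
        variance += (derivative * sigma_k) ** 2
    return math.sqrt(variance)

# Hypothetical example: area of a rectangle with sides 2.0 +/- 0.1 and 3.0 +/- 0.2
sigma_area = propagate_error(lambda a, b: a * b, [2.0, 3.0], [0.1, 0.2])
print(sigma_area)  # close to sqrt((3*0.1)**2 + (2*0.2)**2) = 0.5
```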




1.3.5 Finite Measurement Series and Their Average Error 


If we consider the expression

$$\bar{x} \equiv \frac{1}{N} \sum_{n=1}^{N} x_n$$

as $\bar{x} = f(x_1, \dots, x_N)$, then we can use it in the law of error propagation and deduce
that $\partial f/\partial x_n = N^{-1}$. Hence, all single measurements enter into the error estimate
with the same weight, as already for the estimated value $x_0$.

In order to determine the error $\sigma_n$, we think of an average over several measurement
series, each with $N$ measurements. In this way, we can introduce the average error of
the single measurement and find that all single measurements have the same average
error $\Delta x$. Therefore, the law of error propagation for $N$ equal terms $N^{-2}\,(\Delta x)^2$
delivers

$$(\Delta \bar{x})^2 = N \cdot N^{-2}\,(\Delta x)^2 = \Big( \frac{\Delta x}{\sqrt{N}} \Big)^{2} .$$

The average error $\Delta \bar{x}$ in the mean value of the measurement series is thus the $\sqrt{N}$th
part of the average error in a single measurement: the more often measurements are
made, the more accurate is the determination of the mean value. However, because
of the square root factors, the accuracy can be increased only rather slowly.

Since we do not know the true value $x_0$ itself, but only its approximation $\bar{x}$, we
still have to account for its uncertainty $\Delta \bar{x}$ in order to determine the average error of
the single measurement:

$$(\Delta x)^2 \equiv \overline{(x - x_0)^2} = \overline{(x - \bar{x} + \bar{x} - x_0)^2} = \overline{(x - \bar{x})^2} + 2\,\overline{(x - \bar{x})}\,(\bar{x} - x_0) + (\bar{x} - x_0)^2 .$$

Here, $\overline{x - \bar{x}} = \bar{x} - \bar{x} = 0$ and thus $\overline{(x - \bar{x})^2} = \overline{x^2} - \bar{x}^2$ is rather easy to evaluate. For
$(\bar{x} - x_0)^2$, we take $(\Delta \bar{x})^2 = (\Delta x)^2/N$. Hence, because $1 - N^{-1} = N^{-1}\,(N - 1)$, the
average error of the single measurement is

$$(\Delta x)^2 = \frac{N}{N-1}\,\overline{(x - \bar{x})^2} ,$$

as claimed previously (see p. 46). And so we have the announced proof. For sufficiently
large $N$, we may write $(\Delta x)^2 = \overline{x^2} - \bar{x}^2$. The expression $\Delta x$ is referred to
as the uncertainty of $x$ in quantum theory (see p. 275).


1.3.6 Error Analysis 


How should we modify the result obtained so far if the same quantity is measured
in different ways: first as $x_1 \pm \Delta x_1$, then as $x_2 \pm \Delta x_2$, and so on? What is then the
most probable value for $x_0$, and what average error does it have?




If the readings of the measurement were taken with the same instrument and
equally carefully, the difference in the average errors stems from values $x_n$ from
measurement series of different lengths $N_n$. According to the last section, the average
error of every single measurement in such a measurement series should be equal to
$\Delta x_n \sqrt{N_n}$, and this independently of $n$ in each of the measurement series. Therefore,
the mentioned values $x_n$ should contribute with the weight

$$\rho_n = \frac{N_n}{\sum_n N_n} = \frac{1/(\Delta x_n)^2}{\sum_n 1/(\Delta x_n)^2} ,$$

whence $\bar{x} = \sum_n \rho_n\,x_n$ is the properly weighted mean value. The error propagation
law delivers

$$(\Delta \bar{x})^2 = \sum_n \rho_n^2\,(\Delta x_n)^2 = \frac{\sum_n 1/(\Delta x_n)^2}{\big( \sum_n 1/(\Delta x_n)^2 \big)^2} = \frac{1}{\sum_n 1/(\Delta x_n)^2} \quad \Longrightarrow \quad \frac{1}{(\Delta \bar{x})^2} = \sum_n \frac{1}{(\Delta x_n)^2} .$$

The more precise the readings of the measurement, the more important they are for
the mean value and for the (un)certainty of the results. These considerations are
valid without restriction only if the values are compatible with each other within
their error limits. If they lie further apart from each other, then we have to take

$$(\Delta \bar{x})^2 = \frac{1}{N-1}\,\frac{\sum_n (x_n - \bar{x})^2/(\Delta x_n)^2}{\sum_n 1/(\Delta x_n)^2} .$$


Note that, if the values $x_n$ do not lie within the error limits, then systematic errors
may be involved.

Thus, these two equations answer the questions raised in the general case, where
measurements are taken with different instruments and different levels of care: to
each value $x_n$, we must attach the relative weight $1/(\Delta x_n)^2$.
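The weighting rule can be sketched in a few lines. The function name `weighted_mean` and the two sample results below are hypothetical, and the sketch covers only the case where the values are compatible within their error limits.

```python
import math

def weighted_mean(values, errors):
    """Combine results x_n +/- dx_n using the relative weights 1/(dx_n)^2."""
    weights = [1.0 / dx ** 2 for dx in errors]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)  # 1/(error of mean)^2 = sum of 1/(dx_n)^2

# Hypothetical results of two measurement series for the same quantity
mean, error = weighted_mean([10.0, 11.0], [1.0, 2.0])
print(f"{mean:.2f} +/- {error:.2f}")  # prints 10.20 +/- 0.89
```

The more precise value dominates the mean, and the combined error is smaller than either individual error.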


1.3.7 Method of Least Squares 


A further generalization is necessary if the readings of measurement happen to be
along a straight line, but scatter about it due to random errors. What are then the
best values $a$ and $b$ for $\{y_n \approx a\,x_n + b\}$? More generally, we can fit a power series,
a Fourier series, or some series of known functions.

We always want to determine the readings of measurements as precisely as possible,
in order to make the average error as small as possible. This requirement is
effective under general conditions. Thus the values $a$ and $b$ of the fitting line are to
be determined from the conditions




Fig. 1.17 Example of a fitting line through 12 pairs of measurement values (•). The continuous line
shows $y = a\,x + b$, and the dashed lines show the upper and lower error limits $(a + \Delta a)\,x + (b + \Delta b)$
and $(a - \Delta a)\,x + (b - \Delta b)$, respectively. (In a beginners’ lab course, we can thus establish Hooke’s
law by showing how the length $y$ of a copper wire depends linearly upon the load $x$)


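$$\sum_{n=1}^{N} (y_n - a\,x_n - b)^2 = \min(a, b) ,$$

i.e., $\partial \sum_{n=1}^{N} (y_n - a\,x_n - b)^2/\partial a = 0 = \partial \sum_{n=1}^{N} (y_n - a\,x_n - b)^2/\partial b$. From the last,
we have

$$b = \frac{\sum_{n=1}^{N} (y_n - a\,x_n)}{N} = \bar{y} - a\,\bar{x} ,$$

and from the condition above,

$$a = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{x^2} - \bar{x}^2} = \frac{\overline{(x - \bar{x})\,(y - \bar{y})}}{\overline{(x - \bar{x})^2}} .$$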


Here, the first fraction is easy to evaluate, while the last is easier to interpret, hence the
reformulation. We have thus answered our question as to which values for $a$ and $b$
are the best. For an example, see Fig. 1.17.

To calculate the average errors $\Delta a$ and $\Delta b$, we have to consider the fact that
pairs of values $(x_n, y_n)$ are always associated and only the error in each pair counts,
not the error in $x_n$ and $y_n$ separately. Therefore, for reasons of simplicity we take
the error in $x_n$ as an additional error in $y_n$ and then take $a$ and $b$ in the law of error
propagation as functions of $y_n$. From $b = \bar{y} - a\,\bar{x}$, it then follows that

$$(\Delta b)^2 = (\Delta \bar{y})^2 + \bar{x}^2\,(\Delta a)^2 .$$

From $a = \{\sum_n (x_n - \bar{x})\,y_n\}/\{N\,(\overline{x^2} - \bar{x}^2)\}$, we obtain $\partial a/\partial y_n = (x_n - \bar{x})/\{N\,(\overline{x^2} - \bar{x}^2)\}$,
and thus, finally,

$$(\Delta a)^2 = \sum_{n=1}^{N} \frac{(x_n - \bar{x})^2}{N^2\,(\overline{x^2} - \bar{x}^2)^2}\,(\Delta y_n)^2 .$$

If all errors $\Delta y_n$ in the pairs of values are equally large, then $(\Delta \bar{y})^2 = (\Delta y)^2/N$ and
hence

$$(\Delta a)^2 = \frac{(\Delta y)^2}{N\,(\overline{x^2} - \bar{x}^2)} \quad \text{and} \quad (\Delta b)^2 = \overline{x^2}\,(\Delta a)^2 .$$


We still lack a prescription for calculating the average error $\Delta y$ in a single measurement.
From the original equation $y = a\,x + b$, where two pairs of values are now
necessary in order to determine $a$ and $b$, we have

$$(\Delta y)^2 = \frac{1}{N-2} \sum_{n=1}^{N} (y_n - a\,x_n - b)^2 .$$

More generally, with $K$ parameters we would have the denominator $N - K$, because
the equation $(\Delta \bar{x})^2 = (\Delta x)^2/N$ in one dimension becomes $(\Delta \bar{y})^2 = K\,(\Delta y)^2/N$ in
$K$ dimensions, and this can then be used in $(\Delta y)^2 = \overline{(y - \bar{y})^2} + (\Delta \bar{y})^2$.
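The fitting recipe of this section can be collected into a short routine. This is a minimal sketch under the assumption of equal errors $\Delta y_n$ for all pairs; the function name `fit_line` and the three data pairs are illustrative, and the single-measurement error uses the denominator $N - 2$ for the two fitted parameters.

```python
import math

def fit_line(xs, ys):
    """Least-squares line y = a*x + b with the average errors (da, db) of this section."""
    n = len(xs)
    if n < 3:
        raise ValueError("more than two pairs are needed to estimate the errors")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    xx_mean = sum(x * x for x in xs) / n
    xy_mean = sum(x * y for x, y in zip(xs, ys)) / n
    var_x = xx_mean - x_mean ** 2                  # mean of x^2 minus squared mean of x
    a = (xy_mean - x_mean * y_mean) / var_x
    b = y_mean - a * x_mean
    residual = sum((y - a * x - b) ** 2 for x, y in zip(xs, ys))
    dy2 = residual / (n - 2)                       # (Delta y)^2 with denominator N - 2
    da2 = dy2 / (n * var_x)                        # (Delta a)^2
    db2 = xx_mean * da2                            # (Delta b)^2 = mean(x^2) * (Delta a)^2
    return a, b, math.sqrt(da2), math.sqrt(db2)

# Hypothetical data: three pairs scattering about a line
a, b, da, db = fit_line([0.0, 1.0, 2.0], [0.0, 1.0, 1.0])
print(f"a = {a:.3f} +/- {da:.3f}, b = {b:.3f} +/- {db:.3f}")
```

For these data the slope is $a = 1/2$ and the intercept $b = 1/6$, with $(\Delta a)^2 = 1/12$ and $(\Delta b)^2 = 5/36$.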


List of Symbols 


We stick closely to the recommendations of the International Union of Pure and 
Applied Physics (IUPAP) and the Deutsches Institut fiir Normung (DIN). These 
are listed in Symbole, Einheiten und Nomenklatur in der Physik (Physik-Verlag, 
Weinheim 1980) and are marked here with an asterisk. However, one and the same 
symbol may represent different quantities in different branches of physics. Therefore, 
we have to divide the list of symbols into different parts (Table 1.4). 


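Table 1.4 Standard notation and symbols

| Symbol | Name | Page reference |
|---|---|---|
| | Time | |
| | Position vector | |
| | Volume | |
| | Surface of a volume $V$ | |
| | Area | |
| | Boundary of an area $A$ | |
| | Path element vector | |
| | Surface element vector | |
| | Volume element | 24 |
| | Scalar product of $\mathbf{a}$ and $\mathbf{b}$ | |
| | Vector product of $\mathbf{a}$ and $\mathbf{b}$ | |
| * $\mathbf{a}\,\mathbf{b}$ | Dyadic product of $\mathbf{a}$ and $\mathbf{b}$ | 11 |
| * $\mathbf{e}_x$ | Unit vector $\mathbf{x}/x$ | 3 |
| *² $\nabla$ | Nabla operator | 10 |
| *² $\nabla\phi$ | Gradient of a scalar field $\phi$ | 10 |
| *² $\nabla\cdot\mathbf{a}$ | Divergence of a vector field $\mathbf{a}$ | 11 |
| *² $\nabla\times\mathbf{a}$ | Rotation (curl) of a vector field $\mathbf{a}$ | 13 |
| $\nabla_A\cdot\mathbf{a}$ | Area divergence of a vector field $\mathbf{a}$ | 27 |
| $\nabla_A\times\mathbf{a}$ | Area rotation of a vector field $\mathbf{a}$ | 27 |
| * $\Delta$ | Laplace operator | 15 |
| * $\delta_{ik}$ | Kronecker symbol | 18 |
| * $\delta(x)$ | Dirac delta function | 18 |
| * $\delta x$ | Variation of $x$ | 58, 139 |
| * $\varepsilon(x)$ | Discontinuity function (theta function, step function) | 18 |
| $P\,I$ | Principal value of $I$ | 19 |
| * $\mathbf{k}$ | Wave vector | 24 |
| * $\widetilde{D}$ | Transpose of the matrix $D$ | 3 |
| * $D^{-1}$ | Inverse of the matrix $D$ | 29 |
| * $D^{*}$ | Conjugate of the matrix $D$ | 29 |
| * $D^{\dagger}$ | Adjoint of the matrix $D$ | 29 |
| * $\det D$ | Determinant of the matrix $D$ | 5 |
| * $\operatorname{tr} D$ | Trace of the matrix $D$ | 36 |
| $\mathbf{g}_i$ | Covariant base vector $\partial\mathbf{r}/\partial x^i$ | 31 |
| $\mathbf{g}^i$ | Contravariant base vector $\nabla x^i$ | 31 |
| $a_i$ | Covariant component of $\mathbf{a}$ | 33 |
| $a^i$ | Contravariant component of $\mathbf{a}$ | 33 |
| * $\bar{x}$ | Mean value of $x$ | 46 |
| $\Delta x$ | Uncertainty in $x$ | 46 |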


¹ Total differentials are written with an upright d rather than an italic d. We stick to this throughout.
² In the recommended notation there is no vector arrow above $\nabla$, even though it is a vector operator.


Suggestions for Further Reading 


1. G.B. Arfken, H.J. Weber, Mathematical Methods for Physicists, 6th edn. (Elsevier Academic,
Burlington MA, 2005)

2. P. Blanchard, E. Brüning, Mathematical Methods in Physics: Distributions, Hilbert Space
Operators, and Variational Methods (Springer Science+Business Media, 2003)

3. S. Hassani, Mathematical Physics: A Modern Introduction to Its Foundations (Springer, Berlin,
2013)

4. A. Sommerfeld, Lectures on Theoretical Physics 6: Partial Differential Equations in Physics
(Academic, London, 1949/1953)

5. H. Triebel, Analysis and Mathematical Physics (Springer, Berlin, 1986)


Chapter 2
Classical Mechanics


2.1 Basic Concepts 


2.1.1 Force and Counter-Force 


The best known example of a force is the gravitational force. If we let go of our book, 
it falls downwards. The Earth attracts it. Only with a counter-force can we prevent 
it from falling, as we clearly sense when we are holding it. Instead of our hand, we 
can use something to fix it in place. We can even measure the counter-force with a 
spring balance, e.g., in the unit of force called the newton, denoted $\mathrm{N} = \mathrm{kg\,m/s^2}$.

Each force has a strength and a direction and can be represented by a vector—if 
several forces act on the same point mass, then the total force is found using the 
addition law for vectors. As long as our book is at rest, the gravitational and counter- 
force cancel each other and the total force vanishes. Therefore, the book remains in 
equilibrium. 

Forces act between bodies. In the simplest case, we consider only two bodies. It
is this case to which Newton’s third law refers: Two bodies act on each other with
forces of equal strength, but with opposite direction. This law is often phrased also as
the equation “force = counter-force” or “action = reaction”, even though they refer
only to their moduli. If body $j$ acts on body $i$ with the force $\mathbf{F}_{ij}$, then

$$\mathbf{F}_{ij} = -\mathbf{F}_{ji} .$$

According to this, no body is preferred over any other. They are all on an equal
footing.
We often have to deal with central forces. Then,

$$\mathbf{F}_{ij} = \pm F(r_{ij})\,\mathbf{e}_{ij} , \quad \text{with} \quad \mathbf{e}_{ij} \equiv \frac{\mathbf{r}_{ij}}{r_{ij}} \quad \text{and} \quad \mathbf{r}_{ij} \equiv \mathbf{r}_i - \mathbf{r}_j = -\mathbf{r}_{ji} ,$$

where we have a minus sign for an attractive force and a plus sign for a repulsive
force (see Fig. 1.1). Clearly, they have the required symmetry.

© Springer Nature Switzerland AG 2018
A. Lindner and D. Strauch, A Complete Course on Theoretical Physics, Undergraduate Lecture Notes in Physics,
https://doi.org/10.1007/978-3-030-04360-5_2

Fig. 2.1 The force can only be derived from a potential energy if the work needed to move against
the force from the point $\mathbf{r}_0$ to point $\mathbf{r}_1$ does not depend upon the path between these points, i.e.,
only for $\oint \mathbf{F} \cdot \mathrm{d}\mathbf{r} = 0$ for all closed paths

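The force between two magnetic dipoles $\mathbf{m}_i$ and $\mathbf{m}_j$ is not a central force, but a
tensor force:

$$\mathbf{F}_{ij} = \frac{3\mu_0}{4\pi} \left\{ \frac{\mathbf{m}_i \cdot \mathbf{e}_{ij}}{r_{ij}^4}\,\mathbf{m}_j + \frac{\mathbf{m}_j \cdot \mathbf{e}_{ij}}{r_{ij}^4}\,\mathbf{m}_i + \frac{\mathbf{m}_i \cdot \mathbf{m}_j - 5\,\mathbf{m}_i \cdot \mathbf{e}_{ij}\;\mathbf{m}_j \cdot \mathbf{e}_{ij}}{r_{ij}^4}\,\mathbf{e}_{ij} \right\} .$$

This expression is derived in Sect. 3.2.9 and presented in Fig. 3.12. It depends on the
directions of the three vectors $\mathbf{m}_i$, $\mathbf{m}_j$, and $\mathbf{r}_{ij}$, but also has the required symmetry.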
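The required symmetry $\mathbf{F}_{ij} = -\mathbf{F}_{ji}$ of the tensor force can be checked numerically. The sketch below is only an illustration: the function names, the dipole moments, and the positions are hypothetical, and $\mu_0/4\pi$ is taken as $10^{-7}\,\mathrm{N/A^2}$, which is accurate enough for this purpose.

```python
import math

MU0_OVER_4PI = 1e-7  # N/A^2, assumed value of mu_0/(4 pi)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dipole_force(m_i, m_j, r_i, r_j):
    """Force F_ij exerted by dipole j on dipole i (tensor-force expression of the text)."""
    r_ij = [a - b for a, b in zip(r_i, r_j)]
    dist = math.sqrt(dot(r_ij, r_ij))
    e_ij = [c / dist for c in r_ij]
    mi_e, mj_e, mi_mj = dot(m_i, e_ij), dot(m_j, e_ij), dot(m_i, m_j)
    prefactor = 3.0 * MU0_OVER_4PI / dist ** 4
    return [prefactor * (mi_e * mj + mj_e * mi + (mi_mj - 5.0 * mi_e * mj_e) * e)
            for mi, mj, e in zip(m_i, m_j, e_ij)]

# Hypothetical dipole moments (A m^2) and positions (m)
m1, m2 = [0.0, 0.0, 1.0], [0.3, 0.0, 0.8]
r1, r2 = [0.0, 0.0, 0.0], [0.1, 0.05, 0.2]
f12 = dipole_force(m1, m2, r1, r2)
f21 = dipole_force(m2, m1, r2, r1)
print(f12)
print(f21)  # equal and opposite to f12, up to rounding
```

Although the force is not directed along $\mathbf{e}_{ij}$, exchanging $i$ and $j$ reverses $\mathbf{e}_{ij}$ and hence every term of the expression.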
Newton’s third law also holds for changing positions $\mathbf{r}_{ij}(t)$, but we shall deal with
this in the next section. For the time being, we restrict ourselves to statics. The total
force of the bodies $j$ on a test body $i$ is thus

$$\mathbf{F}_i \equiv \sum_j \mathbf{F}_{ij} .$$

This force will generally change with the position $\mathbf{r}_i$ of the test body, if the other
positions $\mathbf{r}_j$ are kept fixed. We now want to investigate this in more detail and write
$\mathbf{r}$ instead of $\mathbf{r}_i$.


2.1.2 Work and Potential Energy 


It may be easier to work with a scalar field than with a vector field. Therefore, we 
derive the force field F(r) from a scalar field, viz., the potential energy V(r): 


$$\mathbf{F} = -\nabla V .$$

But for this to work, since $\nabla \times \nabla V = \mathbf{0}$, $\mathbf{F}$ has to be curl-free, i.e., the integral
$\oint \mathbf{F} \cdot \mathrm{d}\mathbf{r}$ has to vanish along each closed path. We conclude that a potential energy
can only be introduced if the work

$$A \equiv \int_{\mathbf{r}_0}^{\mathbf{r}_1} \mathbf{F} \cdot \mathrm{d}\mathbf{r}$$

depends solely upon the initial and final points $\mathbf{r}_0$ and $\mathbf{r}_1$ of the path, but not on
the actual path taken in-between (see Fig. 2.1). (Instead of the abbreviation $A$, the
symbol $W$ is often used, but we shall use $W$ in Sect. 2.4.7 for the action function.)
Generally, on a very short path $\mathrm{d}\mathbf{r}$, an amount of work $\delta A \equiv \mathbf{F} \cdot \mathrm{d}\mathbf{r}$ is done. Here we
write $\delta A$ instead of $\mathrm{d}A$, because $\delta A$ is a very small (infinitesimally small) quantity,
but not necessarily a differential one. It is only a differential quantity if there is a
potential energy, hence if $\mathbf{F}$ is curl-free and can be obtained by differentiation:

$$\mathrm{d}V \equiv \nabla V \cdot \mathrm{d}\mathbf{r} = -\mathbf{F} \cdot \mathrm{d}\mathbf{r} = -\mathrm{d}A .$$


For the example of the central and tensor forces mentioned in the last section, a 
potential energy can be given, but it cannot for velocity-dependent forces, i.e., neither 
for the frictional nor for the Lorentz force (acting on a moving charge in a magnetic 
field). We shall investigate these counter-examples in Sect. 2.3.4. 

If there is a potential energy, then according to the equations above it is determined
only up to an additive constant. The zero of $V$ can still be chosen at will and the
constant “adjusted” in some suitable way. The zero of $V$ is set at the point of vanishing
force. If it vanishes for $r \to \infty$, then it follows that

$$V(\mathbf{r}) = -\int_{\infty}^{\mathbf{r}} \mathbf{F}(\mathbf{r}') \cdot \mathrm{d}\mathbf{r}' .$$

But it should be noted once again that this is unique only for $\nabla \times \mathbf{F} = \mathbf{0}$, that is, only
then is there a potential energy.

For a homogeneous force field, the force $\mathbf{F}$ does not depend on the position.
Then the expression $V = -\mathbf{F} \cdot \mathbf{r}$ fits. Likewise, for a central force with $F \propto r^n$, the
potential energy is easily found:

$$\mathbf{F} = c\,r^n\,\frac{\mathbf{r}}{r} \quad \Longrightarrow \quad V = -\frac{c}{n+1}\,r^{n+1} ,$$

if $n \neq -1$, otherwise $V = -c \ln (r/r_0)$, with an arbitrary gauge constant $r_0$. Note
that $V(\infty) = 0$ holds only for $n < -1$.
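The power-law result can be verified by recovering the radial force as $-\mathrm{d}V/\mathrm{d}r$ numerically. A minimal sketch follows; the function names and the values of $c$, $n$, and $r$ are illustrative assumptions, and the derivative is approximated by a central difference.

```python
import math

def central_potential(r, c, n, r0=1.0):
    """Potential energy V(r) for a central force with radial component c * r**n."""
    if n == -1:
        return -c * math.log(r / r0)   # logarithmic case with gauge constant r0
    return -c * r ** (n + 1) / (n + 1)

def radial_force_from_potential(r, c, n, h=1e-6):
    """Radial force -dV/dr, approximated by a central difference."""
    return -(central_potential(r + h, c, n) - central_potential(r - h, c, n)) / (2.0 * h)

r, c = 1.5, -2.0  # hypothetical values; c < 0 corresponds to attraction
for n in (-2, -1, 1):
    print(n, c * r ** n, radial_force_from_potential(r, c, n))
```

Each line shows the exponent, the force $c\,r^n$, and the value recovered from the potential; the last two agree to within the accuracy of the difference quotient.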

If we can derive the two-body force $\mathbf{F}_{ij}$ from a potential energy $V_{ij}$, we have (with
$\mathbf{r}_j$ kept fixed)

$$\mathbf{F}_{ij} = -\nabla_i V_{ij} , \quad \text{with} \quad \nabla_i V_{ij} \cdot \mathrm{d}\mathbf{r}_i \equiv \mathrm{d}V_{ij} .$$

Newton’s third law now delivers $-\nabla_i V_{ij} = \nabla_j V_{ji}$ (on the right-hand side, $\mathbf{r}_i$ is kept
fixed and $\mathbf{r}_j$ variable) with $\nabla_i = -\nabla_j$, since $\mathbf{r}_{ij} = -\mathbf{r}_{ji}$ here, so we have $\mathrm{d}\mathbf{r}_i = -\mathrm{d}\mathbf{r}_j$.
Therefore, with a convenient gauge, we can obtain the symmetry

$$V_{ij} = V_{ji} .$$

Hence a many-body problem has the total potential energy

$$V = \sum_{i<j} V_{ij} = \frac{1}{2} \sum_{i \neq j} V_{ij} ,$$

because each pair $(ij)$ is to be counted only once. (It is often taken for granted that $V_{ii}$
vanishes and the summation is then simply over $i$ and $j$, without indicating $i \neq j$.)


2.1.3 Constraints: Forces of Constraint, Virtual 
Displacements, and Principle of Virtual Work 


We can often replace forces by geometric constraints. If the test body has to remain 
on a plane, we should decompose the force acting on it into its tangential and normal 
components—because it is only the tangential component that is decisive for the 
equilibrium (as long as there is no static friction, since this depends upon the normal 
component). The normal component describes only how strongly the body presses 
on the support, e.g., a sphere on a tabletop. 

Geometrically conditioned forces are called forces of constraint $\mathbf{Z}$. In equilibrium,
the external forces cancel, whence $\sum_i \mathbf{F}_i = -\sum_i \mathbf{Z}_i$. We now consider virtual changes
in the configuration of an experimental setup. In our minds, we alter the positions
slightly, while respecting the constraints, in order to find out how much of it is
rigid and how much is flexible. These alterations (variations) will be denoted by
$\delta\mathbf{r}$. If there is no perturbation due to static friction, then the forces of constraint are
perpendicular to the permitted alterations of position, and therefore the displacement
$\delta\mathbf{r}$ does not contribute to the work. Since $\mathbf{Z}_i \cdot \delta\mathbf{r}_i = 0$, we find the extremely useful
principle of virtual work:

$$\sum_i \mathbf{F}_i \cdot \delta\mathbf{r}_i = 0 .$$

In equilibrium, the virtual work of the externally applied forces vanishes. We do
not need to calculate the forces of constraint here; instead, only the geometrical
constraints must be obeyed. If only curl-free forces are involved, then the associated
potential energy of the total system also suffices. Equilibrium prevails if it does not
change under a virtual displacement: $\nabla V \cdot \delta\mathbf{r} \equiv \delta V = 0$.

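For a lever, the virtual work can be evaluated in a particularly easy way with a
virtual rotation, because the length $R$ of the lever arm does not change, and therefore
we may set $\delta\mathbf{r} = \delta\boldsymbol{\varphi} \times \mathbf{R}$, if $\delta\boldsymbol{\varphi}$ points in the direction of the axis of rotation
(right-hand rule) and if $\mathbf{R}$ points from the axis of rotation to the point where the force acts
(see Fig. 2.2). For the virtual work we obtain therefore

$$\delta A = \mathbf{F} \cdot \delta\mathbf{r} = \mathbf{F} \cdot (\delta\boldsymbol{\varphi} \times \mathbf{R}) = (\mathbf{R} \times \mathbf{F}) \cdot \delta\boldsymbol{\varphi} = \mathbf{M} \cdot \delta\boldsymbol{\varphi} ,$$

with the torque $\mathbf{M}$ of the force $\mathbf{F}$ on the lever arm $\mathbf{R}$ defined by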




Fig. 2.2 Lever law. The rigid lever transmits all those forces to the axle bearing (open circle) 
which do not have an angular momentum with respect to the axis—they are canceled by forces of 
constraint, here by Fj. Equilibrium prevails, if the torque due to the force F and the counter-force 
G cancel each other, as indicated by the two hyperbola branches 


$$\mathbf{M} \equiv \mathbf{R} \times \mathbf{F} .$$


Since $\delta\boldsymbol{\varphi}_i$ for a rigid lever is the same everywhere, equilibrium prevails here if
the sum of the torques vanishes. The principle of virtual work then implies the lever
law

$$\sum_i \mathbf{M}_i = \mathbf{0} , \quad \text{with} \quad \mathbf{M}_i \equiv \mathbf{R}_i \times \mathbf{F}_i .$$

The equilibrium of the lever depends upon the torques, i.e., the vector products of
lever arm and force.
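As a small illustration of the lever law, the torques can be summed with an explicit vector product. The seesaw geometry and the numbers below are hypothetical, as are the function names.

```python
def cross(u, v):
    """Vector product u x v of two 3-component vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def total_torque(lever_arms, forces):
    """Sum of the torques M_i = R_i x F_i about the axis of rotation."""
    total = [0.0, 0.0, 0.0]
    for arm, force in zip(lever_arms, forces):
        total = [t + m for t, m in zip(total, cross(arm, force))]
    return total

# Hypothetical seesaw along x, axis of rotation along y, weights along -z:
# 30 N at 2 m on one side and 20 N at 3 m on the other side.
arms = [[2.0, 0.0, 0.0], [-3.0, 0.0, 0.0]]
forces = [[0.0, 0.0, -30.0], [0.0, 0.0, -20.0]]
print(total_torque(arms, forces))  # all components vanish: equilibrium
```

The two torques are $+60\,\mathrm{N\,m}$ and $-60\,\mathrm{N\,m}$ about the $y$-axis, so the lever is in equilibrium even though the forces differ.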


2.1.4 General Coordinates and Forces 


This section and the next actually touch upon the subject of Lagrangian mechanics, 
wherein, by a clever choice of coordinates, problems can be made soluble which 
otherwise would be intractable. In the static case, many things are much simpler 
than for time-dependent phenomena. It is this that we want to exploit here, and then 
begin by solving several examples (Problems 2.7—2.10), in order to get used to these 
notions. The lever law introduces us to this way of thinking. 

Very often the solubility of a problem depends on a choice of coordinates which
can lead to mathematical as well as physical simplifications. For example, for the
two-body problem we employ center-of-mass and relative coordinates, because the forces
only depend upon the relative coordinate. At best, we choose the coordinates such
that each constraint removes a variable and hence only the remaining ones survive as
variables, e.g., in the case of the lever, we use cylindrical coordinates, because then
the forces of constraint determine $R$ and $z$, and only the angular coordinate $\varphi$ can
vary. Then for an $N$-body problem, we do not need the $3N$ coordinates of the real
space, but only $f < 3N$ coordinates in the configuration space. Here $f$ is called the
number of degrees of freedom of the mechanical problem.




In most textbooks, generalized coordinates are denoted by $q_k$. But here we shall
adopt the notation $x^k$ used in relativity theory and lattice dynamics. This is explained
in detail in Sects. 1.2.2–1.2.5.

The variables

$$x^k = x^k(t, \mathbf{r}_1, \dots, \mathbf{r}_N) , \quad \text{with } k \in \{1, \dots, f\} \text{ and } f \le 3N ,$$

can be Cartesian coordinates, but also curvilinear (e.g., spherical or cylindrical)
coordinates, or even oblique ones, for which $\nabla x^l$ is not perpendicular to $\nabla x^k$ and
therefore we have to distinguish between $x^k$ and $x_k$.

The $f$ generalized coordinates $x^k$ (in addition to parameters like $t$) will describe
the given problem completely:

$$\mathbf{r}_i = \mathbf{r}_i(t, x^1, \dots, x^f) , \quad \text{for all } i \in \{1, \dots, N\} .$$


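Correspondingly, we have for the virtual displacements, keeping time fixed,

$$\delta\mathbf{r}_i = \sum_{k=1}^{f} \frac{\partial \mathbf{r}_i}{\partial x^k}\,\delta x^k ,$$

and for the principle of virtual work,

$$0 = \sum_{i=1}^{N} \mathbf{F}_i \cdot \delta\mathbf{r}_i = \sum_{k=1}^{f} F_k\,\delta x^k ,$$

with the generalized forces

$$F_k \equiv \sum_{i=1}^{N} \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial x^k} , \quad \text{or} \quad F_k = -\frac{\partial V}{\partial x^k} ,$$
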


if the external forces can be associated with a potential energy. The notation $F_k$ with
lower index $k$ corresponds to the convention of Sect. 1.2.2, while the indices $i$ are
used here to count the particles. The $x^k$ do not need to be lengths and the $F_k$ are not
necessarily forces in the usual sense, but $F_k\,\delta x^k$ has to be an energy. Thus, according
to the last section, the generalized force “torque” corresponds to the generalized
coordinate “angle”.

In static equilibrium, all the $F_k$ are equal to zero if none of the $x^k$ depends upon the
others. However, the constraints do not always have such simple properties: not every
constraint fixes a coordinate and leaves the remaining ones undetermined. Therefore,
we now want to treat a more general case.




2.1.5 Lagrangian Multipliers and Lagrange Equations 
of the First Kind 


If, for the moment, we do not consider the $3N - f$ constraints for $N$ point masses,
then in addition to the $f$ generalized coordinates $x^k$ introduced so far, $3N - f$ further
coordinates $x^\kappa$ (with $\kappa \in \{f + 1, \dots, 3N\}$) are still required, and these are in fact
determined by the constraints. We assume that these constraints are given in the form
of equations:

$$\Phi_n(t, x^1, \dots, x^{3N}) = 0 , \quad \text{for all } n \in \{1, \dots, 3N - f\} .$$


Here, equations for differential forms suffice, because only the following $3N - f$
equations have to be valid for arbitrary parameter variations at fixed time:

$$\delta\Phi_n \equiv \sum_{k=1}^{3N} \frac{\partial \Phi_n}{\partial x^k}\,\delta x^k = 0 , \quad \text{with} \quad \delta t = 0 ,$$

where the coefficients do not need to be differential quotients (this becomes necessary
when we trace the forces back to a potential energy). Now we want to make
use of the fact that only the $f$ variations $\delta x^k$ are free, but the remaining $\delta x^\kappa$ depend
upon the former and require the $3N - f$ Lagrangian (undetermined) multipliers $\lambda_n$
(one Lagrangian multiplier for each constraint) to satisfy


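$$\sum_{n=1}^{3N-f} \lambda_n\,\frac{\partial \Phi_n}{\partial x^\kappa} = -F_\kappa , \quad \text{with} \quad F_\kappa \equiv \sum_{i=1}^{N} \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial x^\kappa} .$$

This is an inhomogeneous linear system of $3N - f$ equations with the same number
of dependent variables $x^\kappa$. Once we have determined all Lagrangian multipliers $\lambda_n$
from this, then the relation

$$\sum_{n=1}^{3N-f} \lambda_n\,\delta\Phi_n = \sum_{n=1}^{3N-f} \lambda_n \sum_{l=1}^{3N} \frac{\partial \Phi_n}{\partial x^l}\,\delta x^l = 0$$

implies the following expression for the principle of virtual work (with $\delta t = 0$):

$$\sum_{i=1}^{N} \mathbf{F}_i \cdot \delta\mathbf{r}_i = \sum_{l=1}^{3N} F_l\,\delta x^l = \sum_{k=1}^{f} \Big( F_k + \sum_{n=1}^{3N-f} \lambda_n\,\frac{\partial \Phi_n}{\partial x^k} \Big)\,\delta x^k .$$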


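This has to vanish for arbitrary $\delta x^k$. The bracket has to be zero for all $f$ values $k$.
Here, the Lagrangian multipliers have to be chosen such that the bracket vanishes also
for the remaining $3N - f$ values $\kappa$. We thus have generally for all $l \in \{1, \dots, 3N\}$,

$$F_l + \sum_{n=1}^{3N-f} \lambda_n\,\frac{\partial \Phi_n}{\partial x^l} = 0 , \quad \text{with} \quad F_l \equiv \sum_{i=1}^{N} \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial x^l} \quad \text{and} \quad \delta t = 0 .$$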


These are essentially the Lagrange equations of the first kind. For time-dependent
problems, only the inertial forces are missing, and this will be treated later, in the
context of d’Alembert’s principle (Sect. 2.3.1).

We consider a plane problem as an example. Let $z = 0$ be given. Then we
can leave out the $z$-coordinate right away or, using the position vector $\mathbf{r}$ in
three-dimensional space, calculate with the constraint $\Phi = z = 0$. With the coordinates
$(x^1, x^2, x^3) = (x, y, z)$, we have $0 = F_1 = F_2 = F_3 + \lambda$ in equilibrium. Here, the
Lagrangian multiplier $\lambda$ is equal to the force of constraint $-F_3$, while the forces in
the plane have to vanish. (Further examples can be found in Problems 2.7-2.10.)

Since we could also have required $\lambda_n\Phi_n = 0$ instead of the constraint $\Phi_n = 0$,
only the product $\lambda_n\Phi_n$ has a physical meaning, but not the Lagrangian multiplier
$\lambda_n$ itself.

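If the external forces can be derived from a potential energy, then for all
$l \in \{1, \ldots, 3N\}$ in equilibrium, we also have

$$F_l = -\frac{\partial V}{\partial x^l} \quad\text{and}\quad \frac{\partial}{\partial x^l}\Big(V - \sum_{n=1}^{3N-f} \lambda_n\Phi_n\Big) = 0 .$$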

Consequently, the forces of constraint are obsolete, if we subtract the constraints
with suitable Lagrangian multipliers from the potential energy.
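As a small worked illustration of this recipe (not taken from the text: the pendulum, its length $l$, and the weight $mg$ are assumptions chosen only for the example), consider a point mass in the $x$-$z$ plane with potential energy $V = mgz$ and the constraint $\Phi = x^2 + z^2 - l^2 = 0$:

```latex
\begin{align*}
\frac{\partial}{\partial x}\,(V - \lambda\Phi) &= -2\lambda x = 0, \\
\frac{\partial}{\partial z}\,(V - \lambda\Phi) &= mg - 2\lambda z = 0, \\
\Phi &= x^2 + z^2 - l^2 = 0 .
\end{align*}
% Solution: x = 0, z = \pm l, \lambda = mg/(2z).
% Force of constraint: \lambda\,\nabla\Phi = (2\lambda x,\, 2\lambda z) = (0,\, mg),
% which balances the weight \mathbf{F} = -\nabla V = (0,\, -mg).
```

The multiplier itself depends on how the constraint is written, but the force of constraint $\lambda\nabla\Phi$ does not.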


2.1.6 The Kepler Problem 


The three laws due to Kepler¹ lead uniquely to the acceleration

$$\ddot{\mathbf{r}} = -C\,\frac{\mathbf{r}}{r^3}, \quad\text{with } C = 1.33\times 10^{20}\,\frac{\mathrm{m}^3}{\mathrm{s}^2},$$

as we will prove immediately. It is more usual to start with the gravitational law 
and deduce Kepler’s laws, something we shall do only afterwards. It is customary 
to infer the possible types of motion from a given coupling. But to begin with, we 
solve here the so-called inverse problem, that is, we infer the coupling from the 
observed motion, just as one derives the interaction from scattering experiments. In 
this context, Lenz’s vector results as a conserved quantity, something that is not easy 
to explain with the usual procedure. 


¹Johannes Kepler (1571-1630), among other things, imperial mathematician and astronomer in
Prague and then Professor in Linz (Austria).




Fig. 2.3 An ellipse with eccentricity $\varepsilon = 2/3$, its left focus (filled circle), the distance $\eta$ of the
ellipse from the focus, perpendicular to the main axis, a ray $\mathbf{r}$, and the vector $a\boldsymbol{\varepsilon}$ to the center,
where $a$ is the semi-major axis. The straight dotted lines have length $a$ and $b$, and $a^2 = b^2 + a^2\varepsilon^2$,
according to Pythagoras. The apex P is called the perihelion and A the aphelion (from the Greek
helios for the Sun)


According to Kepler’s first law, each planet moves along an ellipse with the Sun 
at the focus. 


Both celestial objects will be treated as point-like. 

Consider an ellipse with semi-major axis $a$. For each point on the ellipse specified
by the vector $\mathbf{r}$ with origin at one of the foci, the sum of the distances from the foci,
viz., $2a = r + \sqrt{(\mathbf{r} - 2a\boldsymbol{\varepsilon})\cdot(\mathbf{r} - 2a\boldsymbol{\varepsilon})}$, is fixed. Here, $a\boldsymbol{\varepsilon}$ is the vector from one
of the foci to the center of the ellipse, as shown in Fig. 2.3. Hence it follows that
$(2a - r)^2 = r^2 - 4a\,\boldsymbol{\varepsilon}\cdot\mathbf{r} + 4a^2\varepsilon^2$, and we have

$$r - \boldsymbol{\varepsilon}\cdot\mathbf{r} = a\,(1 - \varepsilon^2) \equiv \eta,$$

where $\eta$ is the distance of the ellipse from the focus, measured perpendicular to the
symmetry axis, i.e., at $\mathbf{r} \perp \boldsymbol{\varepsilon}$. This is the starting equation for what follows. The
number $\varepsilon$ is the eccentricity of the ellipse. The vector $\boldsymbol{\varepsilon}$ is the Lenz vector which will
be important later on because it is a characteristic of the orbit, hence a constant of
the motion. (The vector $\mathbf{A} = -m^2 C\,\boldsymbol{\varepsilon}$ is often taken as the Lenz vector.) The area of
the ellipse is $A = \pi ab = \pi a^2\sqrt{1 - \varepsilon^2}$, something we shall need for Kepler's third
law.

Note that our starting equation has not yet fixed a plane orbit, if we take $\mathbf{r}$ as a
vector in three dimensions. The plane orbit is required by Kepler's second law (in
vector form). In addition, the equation for fixed $\eta > 0$ comprises further plane orbits:

$\varepsilon = 0$: circle, $0 < \varepsilon < 1$: ellipse,
$\varepsilon = 1$: parabola, $\varepsilon > 1$: hyperbola branch.

If $\eta$ is negative, then the branch is described from the other focus, but for $\varepsilon < 1$ there
is no longer a real solution. Still, we would like to allow for the generalization to
$\varepsilon > 1$. In this way, we include orbits of meteorites, but also the motion of electrical
point charges in the Coulomb field of other point charges.




Fig. 2.4 The triangle spanned by $\mathbf{r}$ and $d\mathbf{r}$ has area $dA = \tfrac{1}{2}\,|\mathbf{r}\times d\mathbf{r}|$ (see Fig. 1.2). For the
area-velocity law, we use $d\mathbf{r} = \dot{\mathbf{r}}\,dt$


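Differentiating $r^2 = \mathbf{r}\cdot\mathbf{r}$ with respect to time gives $r\dot{r} = \mathbf{r}\cdot\dot{\mathbf{r}}$, so the starting equation yields

$$\Big(\frac{\mathbf{r}}{r} - \boldsymbol{\varepsilon}\Big)\cdot\dot{\mathbf{r}} = 0 .$$

(As an aside, note that $\dot{r}^2$ is not equal to $\dot{\mathbf{r}}\cdot\dot{\mathbf{r}}$, as we can see immediately from a
circular orbit with $\dot{r} = 0$, but $\dot{\mathbf{r}} \ne \mathbf{0}$.) Thus, $\dot{\mathbf{r}}$ is perpendicular to $\mathbf{r}/r - \boldsymbol{\varepsilon}$. Here we
have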


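$$\frac{d}{dt}\,\frac{\mathbf{r}}{r} = \frac{\dot{\mathbf{r}}}{r} - \frac{\mathbf{r}\,\dot{r}}{r^2} = \frac{\dot{\mathbf{r}}\,(\mathbf{r}\cdot\mathbf{r}) - \mathbf{r}\,(\mathbf{r}\cdot\dot{\mathbf{r}})}{r^3} = \frac{(\mathbf{r}\times\dot{\mathbf{r}})\times\mathbf{r}}{r^3},$$

and therefore a further differentiation with respect to time yields

$$\Big(\frac{\mathbf{r}}{r} - \boldsymbol{\varepsilon}\Big)\cdot\ddot{\mathbf{r}} = -\frac{(\mathbf{r}\times\dot{\mathbf{r}})\cdot(\mathbf{r}\times\dot{\mathbf{r}})}{r^3}$$

as a further consequence of Kepler's first law. This equation for $\ddot{\mathbf{r}}$ makes a statement
about the normal acceleration, since $(\mathbf{r}/r - \boldsymbol{\varepsilon})$ is perpendicular to $\dot{\mathbf{r}}$.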


According to Kepler's second law, the ray $\mathbf{r}$ traces equal areas $dA = \tfrac{1}{2}\,|\mathbf{r}\times\dot{\mathbf{r}}\,dt|$
in equal times $dt$.


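This is also called the area-velocity law (see Fig. 2.4). Here, $\mathbf{r}$ and $\dot{\mathbf{r}}$ always span the
same plane. Consequently, the product $\mathbf{r}\times\dot{\mathbf{r}}$ is constant:

$$\mathbf{r}\times\dot{\mathbf{r}} = \mathbf{c} \quad\Longrightarrow\quad \mathbf{r}\times\ddot{\mathbf{r}} = \mathbf{0} \quad\Longrightarrow\quad \ddot{\mathbf{r}} = f\,\mathbf{r} .$$

(Later on, we shall introduce the momentum $m\dot{\mathbf{r}}$ and the orbital angular momentum
$\mathbf{L} = \mathbf{r}\times m\dot{\mathbf{r}}$, where $m$ is the reduced mass, explained in more detail in Sect. 2.2.2.
The angular momentum is a constant of the motion in the non-relativistic context:
according to the area-velocity law, the orbital angular momentum is conserved.)
Using the above-mentioned relation here, we obtain

$$-\frac{c^2}{r^3} = f\,(r - \boldsymbol{\varepsilon}\cdot\mathbf{r}) = f\,\eta \quad\Longrightarrow\quad f = -\frac{c^2}{\eta}\,\frac{1}{r^3} .$$

The acceleration is always oriented towards the focus for $\eta > 0$ (away from the focus
for $\eta < 0$) and decreases as $r^{-2}$ with distance $r$.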




Fig. 2.5 For the Kepler problem, the velocity $\dot{\mathbf{r}}$ traces a circle about the center $-\mathbf{c}\times\boldsymbol{\varepsilon}/\eta$ with radius
$c/\eta$ if $\eta > 0$ (for $\varepsilon > 1$, it is only a section of a circle, because then $\mathbf{r}$ traces only one hyperbola
branch). At perihelion (P), the speed is greatest, and at aphelion it is smallest, with the ratio equal
to $(1 + \varepsilon) : (1 - \varepsilon)$


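The orbit runs perpendicular to $\mathbf{c} = \mathbf{r}\times\dot{\mathbf{r}}$. Therefore, the velocity $\dot{\mathbf{r}}$, which has
to be perpendicular to both $\mathbf{c}$ and $\mathbf{r}/r - \boldsymbol{\varepsilon}$, also satisfies $\dot{\mathbf{r}} \propto \mathbf{c}\times(\mathbf{r}/r - \boldsymbol{\varepsilon})$. The
missing factor follows from $\mathbf{c} = \mathbf{r}\times\dot{\mathbf{r}}$, because $(\mathbf{r}\times\dot{\mathbf{r}})\times(\mathbf{r}/r - \boldsymbol{\varepsilon})$ is equal to

$$\dot{\mathbf{r}}\;\mathbf{r}\cdot\Big(\frac{\mathbf{r}}{r} - \boldsymbol{\varepsilon}\Big) - \mathbf{r}\;\dot{\mathbf{r}}\cdot\Big(\frac{\mathbf{r}}{r} - \boldsymbol{\varepsilon}\Big) = \dot{\mathbf{r}}\,(r - \mathbf{r}\cdot\boldsymbol{\varepsilon}) = \dot{\mathbf{r}}\,\eta,$$

so

$$\eta\,\dot{\mathbf{r}} = \mathbf{c}\times\Big(\frac{\mathbf{r}}{r} - \boldsymbol{\varepsilon}\Big) .$$

Since $\mathbf{c}$ is perpendicular to $\mathbf{r}$, all vectors $\mathbf{c}\times\mathbf{r}/r$ have the fixed length $c$. Therefore,
$\dot{\mathbf{r}}$ describes a circle about the center $-\mathbf{c}\times\boldsymbol{\varepsilon}/\eta$ with radius $c/\eta$ (see Fig. 2.5).

Since $\boldsymbol{\varepsilon}$ and $\mathbf{r}$ are perpendicular to $\mathbf{c}$, the last equation delivers
$\mathbf{c}\times\dot{\mathbf{r}}\,\eta/c^2 = \boldsymbol{\varepsilon} - \mathbf{r}/r$ or

$$\frac{\mathbf{r}}{r} + \frac{\eta}{c^2}\,\mathbf{c}\times\dot{\mathbf{r}} = \boldsymbol{\varepsilon} .$$

Thus, the left vector is a constant of the motion (namely, Lenz's vector), as is
$\mathbf{r}\times\dot{\mathbf{r}} = \mathbf{c}$.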

The two Kepler laws discussed so far can be derived only for pure two-body
problems. However, other planets (and moons) perturb the motion, so those laws are valid only
approximately, as we shall see in Sect. 2.2.6. There we shall also see that, for Kepler's
third law, the mass of the planet has to be negligible compared to the mass of the
Sun. With Kepler's third law the properties of different planets can be compared with
each other:

The cubes of the semi-major axes $a$ of all planets behave like the squares of the
periods $T$.

Indeed,

$$a^3 = \frac{C}{(2\pi)^2}\,T^2, \quad\text{with } C = 1.33\times 10^{20}\,\frac{\mathrm{m}^3}{\mathrm{s}^2} .$$




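According to the second law, the area $A = \pi a^2\sqrt{1 - \varepsilon^2}$ of the ellipse (see p. 63) is
equal to $\tfrac{1}{2}cT$ and thus $T = 2\pi a^2\sqrt{1 - \varepsilon^2}/c$, so we have

$$C = (2\pi)^2\,\frac{a^3}{T^2} = \frac{c^2}{a\,(1 - \varepsilon^2)} = \frac{c^2}{\eta} .$$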
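As a quick numerical check of the third law, the following sketch computes the period from a semi-major axis. The function name `orbital_period` and the Earth value `A_EARTH` (roughly one astronomical unit) are assumptions introduced for the illustration; only the constant $C$ is quoted from the text.

```python
import math

C_SUN = 1.33e20  # m^3/s^2, the constant C quoted in the text

def orbital_period(a, C=C_SUN):
    """Period T in seconds from Kepler's third law a^3 = C T^2 / (2 pi)^2."""
    return 2.0 * math.pi * math.sqrt(a**3 / C)

# Assumed input: semi-major axis of the Earth's orbit, about 1.496e11 m.
A_EARTH = 1.496e11
period_days = orbital_period(A_EARTH) / 86400.0
print(f"Earth period: {period_days:.1f} days")
```

With these inputs the period comes out close to one year, and quadrupling $a$ multiplies $T$ by eight, as the $a^3 \propto T^2$ scaling requires.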


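The abbreviation $\eta$ introduced above may therefore be replaced by $c^2/C$:

$$r - \boldsymbol{\varepsilon}\cdot\mathbf{r} = \frac{c^2}{C}, \qquad \dot{\mathbf{r}} = \frac{C}{c^2}\,\mathbf{c}\times\Big(\frac{\mathbf{r}}{r} - \boldsymbol{\varepsilon}\Big), \qquad \ddot{\mathbf{r}} = -C\,\frac{\mathbf{r}}{r^3},$$

$$\mathbf{c} = \mathbf{r}\times\dot{\mathbf{r}}, \qquad \boldsymbol{\varepsilon} = \frac{\mathbf{r}}{r} + \frac{\mathbf{c}\times\dot{\mathbf{r}}}{C}, \qquad\text{with } \mathbf{c}\cdot\boldsymbol{\varepsilon} = 0 .$$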


Thus, the Kepler problem is uniquely characterized by the two fixed vectors $\mathbf{c}$ and $\boldsymbol{\varepsilon}$
and the constant $C$ (or $a$), where $\mathbf{c}$ and $\boldsymbol{\varepsilon}$ are perpendicular to each other. This gives
6 independent parameters: three Euler angles fix the orbital plane and the direction
of the major axis, while two further parameters determine the lengths of the axes and
the sixth the period.

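If conversely we would like to infer the orbit from the acceleration $\ddot{\mathbf{r}} = -C\,\mathbf{r}/r^3$,
then Kepler's second law follows immediately from $\ddot{\mathbf{r}} \parallel -\mathbf{r}$. Therefore, we may
introduce the vector $\mathbf{c} = \mathbf{r}\times\dot{\mathbf{r}}$ as a constant of the motion. It is perpendicular to the
orbit, i.e., to $\mathbf{r}$ and $\dot{\mathbf{r}}$. A further constant of the motion follows from

$$\frac{d}{dt}\Big(\frac{\mathbf{r}}{r} + \frac{\mathbf{c}\times\dot{\mathbf{r}}}{C}\Big) = \frac{(\mathbf{r}\times\dot{\mathbf{r}})\times\mathbf{r}}{r^3} + \frac{\mathbf{c}\times\ddot{\mathbf{r}}}{C} = \frac{\mathbf{c}\times\mathbf{r}}{r^3} - \frac{\mathbf{c}\times\mathbf{r}}{r^3} = \mathbf{0},$$

namely Lenz's vector

$$\boldsymbol{\varepsilon} = \frac{\mathbf{r}}{r} + \frac{\mathbf{c}\times\dot{\mathbf{r}}}{C} .$$

This can be solved for $\dot{\mathbf{r}}$, because $\mathbf{c}\cdot\dot{\mathbf{r}} = 0$, and we can also take the scalar product
with $\mathbf{r}$:

$$\dot{\mathbf{r}} = \frac{C}{c^2}\,\mathbf{c}\times\Big(\frac{\mathbf{r}}{r} - \boldsymbol{\varepsilon}\Big) \quad\text{and}\quad \mathbf{r}\cdot\boldsymbol{\varepsilon} = r - \frac{c^2}{C} .$$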
From this, we obtain (for $C > 0$) the elliptical orbit with the focus as the origin
(Kepler's first law). We can thus derive all Kepler's laws from the single equation
$\ddot{\mathbf{r}} = -C\,\mathbf{r}/r^3$, because the third follows from the other two if $C$ is the same for all
planets. Instead of $c^2/C$, we have used the geometric quantity $\eta$ above.
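Since $\boldsymbol{\varepsilon} = \mathbf{r}/r + \mathbf{c}\times\dot{\mathbf{r}}\,\eta/c^2$ and $\mathbf{c}\cdot\dot{\mathbf{r}} = 0$, we obtain the relation
$\varepsilon^2 = \boldsymbol{\varepsilon}\cdot\boldsymbol{\varepsilon} = 1 - 2\eta/r + \dot{\mathbf{r}}\cdot\dot{\mathbf{r}}\;\eta^2/c^2$ for the square of the Lenz vector $\boldsymbol{\varepsilon}$, and since
$c^2/\eta = C$, the square of the velocity is given by

$$v^2 \equiv \dot{\mathbf{r}}\cdot\dot{\mathbf{r}} = \frac{2C}{r} - (1 - \varepsilon^2)\,\frac{C}{\eta} .$$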
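The two constants of the motion and the velocity relation can be checked numerically from a single state $(\mathbf{r}, \dot{\mathbf{r}})$. The sketch below is only an illustration; the helper names `kepler_invariants` and `speed_squared` and the sample state are assumptions, not part of the text.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def kepler_invariants(r, v, C):
    """Return (c, eps) with c = r x v and the Lenz vector eps = r/|r| + (c x v)/C."""
    c = cross(r, v)
    r_norm = math.sqrt(dot(r, r))
    c_x_v = cross(c, v)
    eps = tuple(ri / r_norm + wi / C for ri, wi in zip(r, c_x_v))
    return c, eps

def speed_squared(r_norm, C, c, eps):
    """v^2 = 2C/r - (1 - eps^2) C/eta, with eta = c^2/C."""
    eta = dot(c, c) / C
    return 2.0 * C / r_norm - (1.0 - dot(eps, eps)) * C / eta

# Sample state (assumed): unit distance, velocity perpendicular to r, C = 1.
r0, v0, C0 = (1.0, 0.0, 0.0), (0.0, 1.2, 0.0), 1.0
c0, eps0 = kepler_invariants(r0, v0, C0)
print(c0, eps0, speed_squared(1.0, C0, c0, eps0))
```

For this state the Lenz vector is perpendicular to $\mathbf{c}$, the starting equation $r - \boldsymbol{\varepsilon}\cdot\mathbf{r} = c^2/C$ holds, and `speed_squared` reproduces $\dot{\mathbf{r}}\cdot\dot{\mathbf{r}}$.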




Fig. 2.6 The hyperbola branch $r - \boldsymbol{\varepsilon}\cdot\mathbf{r} = \eta$ for eccentricity $\varepsilon = 3/2$ (red) with the two foci (full
circle, $\eta > 0$, attractive force and open circle, $\eta < 0$, repulsive force) and the asymptotes (dashed
blue lines) in the directions to the initial and final points. In addition to the ray $\mathbf{r}$, the vector $a\boldsymbol{\varepsilon}$ to
the center and the length $\eta$ can be seen as in Fig. 2.3. The turning point is at a distance $a$ from the
center. In addition, the scattering angle $\theta$ and the collision parameter $s$ are shown


This relation can also be derived from the conservation of energy (p. 78), because
$\tfrac{1}{2}mv^2$ is the kinetic energy and $-mC/r$ the potential energy.

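For a circular orbit about the zero point, the two foci coincide and $\mathbf{r}\cdot\dot{\mathbf{r}} = 0$.
For constant orbital angular momentum, $\mathbf{r}\times\dot{\mathbf{r}} = \mathbf{c}$ is also conserved. Then, with
$\boldsymbol{\omega} \equiv \mathbf{c}/r^2$,

$$\dot{\mathbf{r}} = \boldsymbol{\omega}\times\mathbf{r} .$$

We shall encounter this differential equation repeatedly. With $\mathbf{r}(0) \perp \boldsymbol{\omega}$, it is solved
by $\mathbf{r}(t) = \mathbf{r}(0)\cos(\omega t) + \omega^{-1}\,\boldsymbol{\omega}\times\mathbf{r}(0)\,\sin(\omega t)$. Note that, if $\mathbf{r}(0)$ also has a
component in the direction of $\boldsymbol{\omega}$, then this is conserved.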
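The closed-form solution, including the conserved component along $\boldsymbol{\omega}$, can be sketched as follows. The function name `rotate` is an assumption made for this illustration; it splits the initial vector into parts parallel and perpendicular to $\boldsymbol{\omega}$ and rotates only the perpendicular part.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate(omega, r0, t):
    """Solution of dr/dt = omega x r for constant omega and initial value r0."""
    w = math.sqrt(sum(x * x for x in omega))
    if w == 0.0:
        return tuple(r0)
    n = tuple(x / w for x in omega)                    # unit vector along omega
    along = sum(a * b for a, b in zip(n, r0))          # conserved component
    r_par = tuple(along * x for x in n)
    r_perp = tuple(a - b for a, b in zip(r0, r_par))
    n_x_r = cross(n, r_perp)
    return tuple(p + q * math.cos(w * t) + s * math.sin(w * t)
                 for p, q, s in zip(r_par, r_perp, n_x_r))

# Quarter turn about the z-axis: the z-component 3.0 stays fixed.
print(rotate((0.0, 0.0, 2.0), (1.0, 0.0, 3.0), math.pi / 4.0))
```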

Let us thus look at the hyperbolic orbit (with $\varepsilon > 1$) (see Fig. 2.6). The directions
of its asymptotes are determined by $r - \boldsymbol{\varepsilon}\cdot\mathbf{r} = 0$ and $\varepsilon\cos\varphi = 1$, where $\varphi$ is half
the opening angle. It is convenient here to define the scattering angle $\theta = \pi - 2\varphi$
and obtain $\sin\tfrac{1}{2}\theta = \varepsilon^{-1}$ and $\cot\tfrac{1}{2}\theta = \sqrt{\varepsilon^2 - 1}$. This can be expressed in terms of
$v_\infty$, because $v_\infty^2 = (\varepsilon^2 - 1)(C/c)^2$ and thus $\cot\tfrac{1}{2}\theta = v_\infty c/C$. If we then introduce
the collision parameter $s$ (distance of the asymptotes from the foci) with $c = s\,v_\infty$,
we obtain $\cot\tfrac{1}{2}\theta = s\,v_\infty^2/C$.

This result is useful for the Rutherford cross-section, which describes the angular
distribution for the elastic scattering of point charges $q$ by a point charge $q'$:
whatever enters the circular ring $2\pi s\,ds$ is scattered into the cone opening $2\pi\sin\theta\,d\theta$
(shown on the left of Fig. 5.5):

$$\frac{d\sigma}{d\Omega} = \left|\frac{2\pi s\,ds}{2\pi\sin\theta\,d\theta}\right| = \frac{1}{2}\left|\frac{ds^2}{d\cos\theta}\right| = \frac{C^2}{2v_\infty^4}\left|\frac{d\cot^2\frac{1}{2}\theta}{d\cos\theta}\right| .$$


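Since $\cos\theta = \cos^2\tfrac{1}{2}\theta - \sin^2\tfrac{1}{2}\theta = 2\cos^2\tfrac{1}{2}\theta - 1$ and

$$\frac{d\cot^2\frac{1}{2}\theta}{d\cos\theta} = \frac{1}{2}\,\frac{d}{d\cos^2\frac{1}{2}\theta}\,\frac{\cos^2\frac{1}{2}\theta}{1 - \cos^2\frac{1}{2}\theta} = \frac{1}{2\sin^4\frac{1}{2}\theta},$$

we obtain

$$\frac{d\sigma}{d\Omega} = \frac{C^2}{4}\,\frac{1}{(v_\infty\sin\frac{1}{2}\theta)^4} = \frac{4C^2}{|\mathbf{v} - \mathbf{v}'|^4},$$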


where $\mathbf{v}$ is the initial velocity and $\mathbf{v}'$ the final velocity. Here, $C = qq'/(4\pi\varepsilon_0 m)$ with
the reduced mass $m$, explained in more detail in Sect. 2.2.2, and the electric field
constant $\varepsilon_0$, if the charges $q$ and $q'$ are given in coulomb (see p. 165 ff.). This is also
obtained in (non-relativistic) quantum mechanics, as will be shown in Sect. 5.2.3.
The scattering cross-section integrated over all directions $\Omega$ diverges, because the
Coulomb force extends too far out. In reality, it will be screened by further charges.
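The chain from collision parameter to cross-section can be verified numerically. In the sketch below the function names are assumptions made for the illustration; `cross_section_numeric` evaluates $|s\,ds/(\sin\theta\,d\theta)|$ by a finite difference and can be compared with the closed Rutherford form (for $C > 0$ and $s > 0$).

```python
import math

def scattering_angle(s, v_inf, C):
    """Scattering angle theta from cot(theta/2) = s v_inf^2 / C."""
    return 2.0 * math.atan2(C, s * v_inf**2)

def rutherford(theta, v_inf, C):
    """d(sigma)/d(Omega) = C^2 / (4 v_inf^4 sin^4(theta/2))."""
    return C**2 / (4.0 * v_inf**4 * math.sin(0.5 * theta)**4)

def cross_section_numeric(s, v_inf, C, h=1e-6):
    """|s ds / (sin(theta) d(theta))| with d(theta)/ds from a central difference."""
    theta = scattering_angle(s, v_inf, C)
    dtheta_ds = (scattering_angle(s + h, v_inf, C)
                 - scattering_angle(s - h, v_inf, C)) / (2.0 * h)
    return abs(s / (math.sin(theta) * dtheta_ds))

theta = scattering_angle(1.0, 1.0, 1.0)   # cot(theta/2) = 1, a right-angle deflection
print(theta, rutherford(theta, 1.0, 1.0), cross_section_numeric(1.0, 1.0, 1.0))
```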


2.1.7 Summary: Basic Concepts 


We set the notion of force F as a basic ingredient of mechanics. The next section will 
be concerned with a different possibility, and we shall thus derive further quantities. 
In particular, a force can do work $\int\mathbf{F}\cdot d\mathbf{r}$ along a path. If this work depends only
upon the initial and final point of the path and not upon the path in-between, then we
may set $\mathbf{F} = -\nabla V$ and work with the simpler, scalar potential energy $V$.

According to Newton’s third law, two bodies act on each other via equal but 
oppositely directed forces, which are not necessarily central forces. 

A special kind of forces are the forces of constraint. They originate from geo- 
metric constraints and do no work. Therefore, they do not need to be accounted for 
as forces due to virtual displacements—instead, the geometric constraint has to be 
obeyed for all displacements $\delta\mathbf{r}$. If we can write the constraint as an equation $\Phi = 0$,
then it can be accounted for by a Lagrangian parameter for the potential energy:
$\nabla(V - \lambda\Phi) = \mathbf{0}$ in equilibrium.

In addition to these notions, decisive for statics, we have also treated the Kepler 
problem as an example for kinematics. From Kepler’s first and second laws (dating to 
1609), we could infer $\ddot{\mathbf{r}} \propto -\mathbf{r}/r^3$. The missing factor here is the same for all planets,
according to Kepler’s third law (dating to 1619). With these laws, their motions can 
be described by a single differential equation—the different orbits follow from the 
corresponding initial conditions. 




2.2 Newtonian Mechanics 


2.2.1 Force-Free Motion 


Newton² took the inertial law due to Galileo (1564-1642) as his first axiom in 1687:


If no force acts on it (also no frictional force), a body remains in its state of rest 
or of uniform rectilinear motion—it is inertial. 


Here uniform rectilinear motion and the state of rest are equivalent. Different points
of view are permitted, at rest and moving, as long as they are not accelerated relative
to one another. Such allowable reference frames will be called inertial frames. In these
frames, force-free bodies obey the inertial law. In contrast, bodies on curved orbits are
always accelerated, according to Sect. 1.1.3. As a measure for uniform rectilinear
motion, it is natural to think of the velocity. But Newton introduced instead the
momentum

$$\mathbf{p} \equiv m\,\mathbf{v}$$

as motional quantity. This is the velocity weighted by the inertial mass $m$. We shall
encounter the notion of inertial mass in the context of the scattering laws in Sect. 2.2.3. 
For the moment it is sufficient for our purposes to note that each invariable body has 
a fixed mass, which depends neither upon time nor upon the position or the velocity, 
and is therefore a conserved quantity: a quantity is said to be conserved if it does not 
change with time. (Burning rockets and growing avalanches are “variable bodies”, 
whose mass does not remain constant in time.) Therefore, the inertial law may also 
be called the momentum conservation law (often called momentum conservation for 
short): 


$$\frac{d\mathbf{p}}{dt} \equiv \dot{\mathbf{p}} = \mathbf{0}, \quad\text{for force-free motion.}$$


If no force acts, the momentum is conserved (inertial law, law of persistence). 
According to the theory of special relativity (Sect. 3.4), a body cannot move faster
than the speed of light ($c = 299\,792\,458$ m/s), and therefore one actually has to set
(with the proper time $\tau$)

$$\frac{d}{d\tau} = \gamma\,\frac{d}{dt}, \quad\text{with } \gamma \equiv \frac{1}{\sqrt{1 - (d\mathbf{r}/dt)^2/c^2}} .$$

We then have $\mathbf{p} = m\,\gamma\,d\mathbf{r}/dt$. The factor $\gamma$ is notably different from 1 only for
$v \approx c$, as is clear from Fig. 3.23. Therefore, the simple non-relativistic calculation


²Isaac Newton (1643-1727) was professor in Cambridge from 1669 to 1701, Master of the Royal
Mint in London in 1699, and President of the Royal Society of London in 1703.




fully suffices for many applications. To a similarly good approximation, “fixed” stars 
always remain at the same position and deliver a generalized reference frame. 

As long as m does not depend on time and we consider force-free motion, then in 
addition to the polar vector p, a scalar and an axial vector remain conserved, namely, 
the kinetic energy T and the angular momentum L: 


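$$T = \frac{1}{2m}\,\mathbf{p}\cdot\mathbf{p} = \frac{m}{2}\,\mathbf{v}\cdot\mathbf{v} \quad\text{and}\quad \mathbf{L} = \mathbf{r}\times\mathbf{p},$$

since for fixed $\mathbf{p}$ and $m$, the quantity $T$ is also conserved, and for $\dot{\mathbf{p}} = \mathbf{0}$, we also
have $\dot{\mathbf{L}} = \dot{\mathbf{r}}\times\mathbf{p} = \mathbf{p}\times\mathbf{p}/m = \mathbf{0}$. Altogether then, we have

$$\dot{T} = 0, \quad \dot{\mathbf{p}} = \mathbf{0}, \quad\text{and}\quad \dot{\mathbf{L}} = \mathbf{0},$$

if no force acts.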

In what follows, it will be useful to view the kinetic energy T as a scalar field of 
the variables v. Hence, in the velocity space, we may also take the momentum p as 
the gradient of T (and use lower indices according to p. 35): 


$$\mathbf{p} = \nabla_{\mathbf{v}}T \quad\text{and}\quad p_k = \frac{\partial T}{\partial v^k} .$$


This will help us later in Lagrangian and Hamiltonian mechanics, but also for the 
separation into center-of-mass and relative motion. 


2.2.2 Center-of-Mass Theorem 


We have just introduced the mass m as a constant factor in p = mv. It has not yet 
been explained why we need the momentum at all in addition to the velocity. This 
changes only when there are several masses $m_1, m_2, \ldots$, for the above-mentioned
laws are valid not only for a single body, but also for several bodies, which normally 
act on each other and thus exert forces—as long as there are no external forces acting 
on the bodies. According to Newton’s third law (force equal to counter-force), the 
forces between the bodies cancel each other. Therefore, without external forces there 
is also no force on the system as a whole. This system we can treat as a single body. 
Its momentum is composed of the individual momenta and is conserved: 


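$$\mathbf{P} \equiv \sum_i \mathbf{p}_i, \qquad \dot{\mathbf{P}} = \mathbf{0}, \quad\text{if no external forces act.}$$

The masses thus weight the individual velocities. Hence, for two bodies without
external forces, we have $\dot{\mathbf{p}}_1 + \dot{\mathbf{p}}_2 = \mathbf{0}$, but $\dot{\mathbf{p}}_1 = -\dot{\mathbf{p}}_2 \ne \mathbf{0}$ if they act on each other.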




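If we introduce the total mass $M$ and the position $\mathbf{R}$ of the center of mass,

$$M \equiv \sum_i m_i, \qquad \mathbf{R} \equiv \frac{1}{M}\sum_i m_i\,\mathbf{r}_i,$$

then for $\mathbf{F} = \mathbf{0}$, it moves with the constant velocity

$$\mathbf{V} \equiv \dot{\mathbf{R}} = \frac{1}{M}\sum_i m_i\,\dot{\mathbf{r}}_i = \frac{\mathbf{P}}{M} .$$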


The total momentum is thus equal to the momentum of the center of mass. It is 
conserved if there are no external forces (center-of-mass law)—and hence according 
to the last section, the kinetic energy and the angular momentum of the center of 
mass remain conserved. 

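For many-body problems it is helpful to introduce center-of-mass and relative
vectors instead of the position vectors $\mathbf{r}_i$. We shall show this for the case of two point
masses with $M = m_1 + m_2$:

$$\mathbf{R} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{M} \quad\text{and}\quad \mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1 .$$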


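(For more point masses, we must proceed stepwise. After the two-body problem, the
third is to be treated with respect to the center of mass of the first two, and so on.
This leads to the Jacobi coordinates. In view of this, we thus take $\mathbf{r}_2 - \mathbf{r}_1$ and not
$\mathbf{r}_1 - \mathbf{r}_2$ as the relative vector.) For this, it is convenient to write

$$\begin{pmatrix}\mathbf{R}\\ \mathbf{r}\end{pmatrix} = \begin{pmatrix} m_1/M & m_2/M\\ -1 & 1\end{pmatrix}\begin{pmatrix}\mathbf{r}_1\\ \mathbf{r}_2\end{pmatrix} .$$

The determinant of the matrix here is equal to 1 and thus the map is area-preserving
(see Fig. 2.7). (In more than two dimensions, the corresponding volume should
remain conserved, whence the functional determinant has also to be equal to 1, as
discussed further in Sect. 1.2.4.) Therefore, conversely, we have

$$\begin{pmatrix}\mathbf{r}_1\\ \mathbf{r}_2\end{pmatrix} = \begin{pmatrix} 1 & -m_2/M\\ 1 & m_1/M\end{pmatrix}\begin{pmatrix}\mathbf{R}\\ \mathbf{r}\end{pmatrix},$$

since the inverse of a $2\times 2$ matrix is

$$\begin{pmatrix} a & b\\ c & d\end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b\\ -c & a\end{pmatrix},$$

something we shall use repeatedly.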




Fig. 2.7 With the change from two-body to center-of-mass and relative coordinates (left) or vice
versa (right), we make a transformation from rectangular to oblique coordinates which is not
angle-preserving, shown here for the x-components and $m_1 = 2m_2$. The unit square (dashed lines) turns
into a rhomboid of equal area (see Fig. 1.11). For $m_1 = m_2$, we have a rectangle on the left and a
rhombus on the right


The same matrices appear for the transition $(\mathbf{v}_1, \mathbf{v}_2) \leftrightarrow (\mathbf{V}, \mathbf{v})$, because they
remain conserved for the derivative with respect to time. Because $\mathbf{v}_1 = \mathbf{V} - \mathbf{v}\,m_2/M$
and $\mathbf{v}_2 = \mathbf{V} + \mathbf{v}\,m_1/M$, the kinetic energy is

$$T = \frac{m_1 v_1^2 + m_2 v_2^2}{2} = \frac{M V^2 + \mu v^2}{2}, \quad\text{with } \mu \equiv \frac{m_1 m_2}{M} .$$

Only this reduced mass $\mu$ is important for relative motions. Hence, in $T$ the mixed
term $\mathbf{V}\cdot\mathbf{v}$ vanishes if we introduce a relative vector $\mathbf{r} \propto \mathbf{r}_2 - \mathbf{r}_1$ in addition to
the center-of-mass vector $\mathbf{R}$. The center-of-mass and relative motion then already
decouple. With $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$, we even obtain an area-preserving map (hence also
$\mathbf{r}_1\times\mathbf{r}_2 = \mathbf{R}\times\mathbf{r}$), but not an angle-preserving map, because the matrices are not
orthogonal. Since we have already made a transition from $T(\mathbf{v}_1, \mathbf{v}_2)$ to $T(\mathbf{V}, \mathbf{v})$, we
can also easily derive the momenta as gradients in velocity space:


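$$\mathbf{P} = M\mathbf{V}, \quad \mathbf{p} = \mu\mathbf{v} \quad\Longrightarrow\quad T = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} = \frac{P^2}{2M} + \frac{p^2}{2\mu} .$$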


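We already know the expression for $\mathbf{P}$. Clearly, the two momenta $\mathbf{P}$ and $\mathbf{p}$ can be
expressed as linear combinations of $\mathbf{p}_1$ and $\mathbf{p}_2$, viz.:

$$\begin{pmatrix}\mathbf{P}\\ \mathbf{p}\end{pmatrix} = \begin{pmatrix} 1 & 1\\ -m_2/M & m_1/M\end{pmatrix}\begin{pmatrix}\mathbf{p}_1\\ \mathbf{p}_2\end{pmatrix} \quad\text{or}\quad \begin{pmatrix}\mathbf{p}_1\\ \mathbf{p}_2\end{pmatrix} = \begin{pmatrix} m_1/M & -1\\ m_2/M & 1\end{pmatrix}\begin{pmatrix}\mathbf{P}\\ \mathbf{p}\end{pmatrix},$$

noting that the momentum transformations are also area-preserving. In addition, we
find for the angular momentum

$$\mathbf{L} = \mathbf{r}_1\times\mathbf{p}_1 + \mathbf{r}_2\times\mathbf{p}_2 = \mathbf{R}\times\mathbf{P} + \mathbf{r}\times\mathbf{p} .$$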
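The change to center-of-mass and relative quantities is easy to check numerically. The following sketch uses assumed helper names (`to_cm_relative`, `from_cm_relative`, `kinetic_energy_two_ways`) and arbitrary sample values; it applies the two matrices above to velocities and compares the two forms of the kinetic energy.

```python
def to_cm_relative(m1, m2, x1, x2):
    """Map (x1, x2) to (X, x) with X = (m1 x1 + m2 x2)/M and x = x2 - x1."""
    M = m1 + m2
    X = tuple((m1 * a + m2 * b) / M for a, b in zip(x1, x2))
    x = tuple(b - a for a, b in zip(x1, x2))
    return X, x

def from_cm_relative(m1, m2, X, x):
    """Inverse map: x1 = X - x m2/M and x2 = X + x m1/M."""
    M = m1 + m2
    x1 = tuple(A - a * m2 / M for A, a in zip(X, x))
    x2 = tuple(A + a * m1 / M for A, a in zip(X, x))
    return x1, x2

def kinetic_energy_two_ways(m1, m2, v1, v2):
    """Return T from (v1, v2) and from (V, v) with the reduced mass mu."""
    M = m1 + m2
    mu = m1 * m2 / M
    V, v = to_cm_relative(m1, m2, v1, v2)
    sq = lambda a: sum(c * c for c in a)
    direct = 0.5 * (m1 * sq(v1) + m2 * sq(v2))
    split = 0.5 * (M * sq(V) + mu * sq(v))
    return direct, split

print(kinetic_energy_two_ways(2.0, 1.0, (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)))
```

Both numbers agree, which reflects the vanishing of the mixed term $\mathbf{V}\cdot\mathbf{v}$.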




If no external forces act, the forces depend only upon r (and possibly upon v ), and 
we only need to deal with the relative motion. With this, the two-body problem is 
reduced to a single-body problem and has become essentially easier. 

The center-of-mass frame stands out here: if we choose the center of mass as
origin, then $\mathbf{P} = \mathbf{0}$ and consequently, $\mathbf{p}_2 = -\mathbf{p}_1 = \mathbf{p}$.


2.2.3 Collision Laws 


If two bodies collide without external forces acting, then the relative motion changes, 
but not the motion of the center of mass: P’ = P. (Primed quantities will be used to 
describe the final state.) As far as the relative motion is concerned we need further 
information. In the following we consider only the motion before and after the colli- 
sion, not during the collision—therefore we do not care about the forces between the 
collision partners. These are necessary, however, if we need to determine the scatter- 
ing angle. In genuine scattering theory (see, e.g., Sects. 5.1 and 5.2), the interaction 
between the partners is indispensable. 

In addition to elastic scattering, we need also to deal with inelastic processes, but 
without exchange of mass, i.e., the collision partners keep their masses, but during 
the collision, their relative motion could possibly lose energy which is converted 
into work of deformation, rotational energy, or heat. (With exchange of mass the 
equations become less clear, but in principle, the situation is no more difficult to 
treat.) Here, we introduce the heat tone $Q \equiv (p'^2 - p^2)/2\mu$: for elastic scattering
$p' = p$ and hence $Q = 0$. In contrast, for completely inelastic scattering, we have
$p' = 0$, and thus $Q = -p^2/2\mu$. The ratio $p'/p$ is abbreviated to

$$\xi \equiv \frac{p'}{p}, \quad\text{with } p' = \sqrt{p^2 + 2\mu Q} .$$

For elastic scattering $\xi = 1$ and for completely inelastic scattering $\xi = 0$.

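The relative momenta $\mathbf{p}'$ and $\mathbf{p}$ may have different moduli and also different directions.
Therefore, we set $\mathbf{p}' = \xi D\,\mathbf{p}$ with the rotation operator $D$ given in Sect. 1.2.1.
Then with $\mathbf{P}' = \mathbf{P}$, according to the last section, we obtain

$$\begin{pmatrix}\mathbf{p}_1'\\ \mathbf{p}_2'\end{pmatrix} = \frac{1}{M}\begin{pmatrix} m_1 + \xi m_2 D & m_1 - \xi m_1 D\\ m_2 - \xi m_2 D & m_2 + \xi m_1 D\end{pmatrix}\begin{pmatrix}\mathbf{p}_1\\ \mathbf{p}_2\end{pmatrix} .$$

For a completely inelastic collision ($\xi = 0$), we thus have $\mathbf{v}_1' = \mathbf{v}_2' = \mathbf{V}$.
Rather simple situations also occur for collisions between two mass points,
because they take place only for $\mathbf{r} = \mathbf{0}$. In this case the conservation of angular
momentum leads to $D = -1$. If we consider here an elastic collision, then $\xi D = -1$
and hence,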




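$$\begin{pmatrix}\mathbf{p}_1'\\ \mathbf{p}_2'\end{pmatrix} = \frac{1}{M}\begin{pmatrix} m_1 - m_2 & 2m_1\\ 2m_2 & m_2 - m_1\end{pmatrix}\begin{pmatrix}\mathbf{p}_1\\ \mathbf{p}_2\end{pmatrix} .$$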

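In the special case where $m_2 = m_1$, it follows that $\mathbf{p}_1' = \mathbf{p}_2$ and $\mathbf{p}_2' = \mathbf{p}_1$: for equal
masses the momenta (velocities) are exchanged. In contrast, for $m_2 \ll m_1$, it follows
that $\mathbf{p}_1' \approx \mathbf{p}_1 + 2\mu\,\mathbf{v}_2$ and $\mathbf{p}_2' \approx 2\mu\,\mathbf{v}_1 - \mathbf{p}_2$ with $\mu \approx m_2$: only the small mass
significantly changes its velocity.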
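For motion along a line, the collision of two point masses can be sketched in a few lines. The function name `collide_point_masses` and the sample numbers are assumptions for the illustration; the code uses $D = -1$, so that $p' = -\xi p$, together with the conserved total momentum.

```python
def collide_point_masses(m1, m2, p1, p2, xi=1.0):
    """Final momenta (p1', p2') on a line for D = -1, i.e. p' = -xi p.

    xi = 1 is the elastic case, xi = 0 the completely inelastic one.
    """
    M = m1 + m2
    P = p1 + p2                      # total momentum, conserved
    p = (m1 * p2 - m2 * p1) / M      # relative momentum p = mu (v2 - v1)
    p_final = -xi * p
    return m1 * P / M - p_final, m2 * P / M + p_final

# Equal masses exchange their momenta in an elastic collision.
print(collide_point_masses(1.0, 1.0, 3.0, -1.0))
# Completely inelastic: both bodies leave with the center-of-mass velocity.
print(collide_point_masses(2.0, 1.0, 4.0, -1.0, xi=0.0))
```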

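Let us return to the collision of extended particles, but choose $\mathbf{p}_1 = \mathbf{0}$ and in this
"laboratory frame" derive $\mathbf{p}_{1L}'$ and $\mathbf{p}_{2L}'$. Here and in the following we indicate clearly
whether the quantity refers to the laboratory frame (L) or the center-of-mass frame
(S). However, this is unnecessary for $\mathbf{P}$ and $\mathbf{p}$: since $\mathbf{P}_S = \mathbf{0}$, the total momentum
$\mathbf{P}$ should always refer to the laboratory frame, and the relative momentum $\mathbf{p}$ does
not depend on the reference frame. According to the last section, $\mathbf{p} = \mathbf{p}_{2S} = -\mathbf{p}_{1S}$,
and for $\mathbf{p}_{1L} = \mathbf{0}$, we have $\mathbf{P} = \mathbf{p}_{2L}$ and also $\mathbf{p} = (m_1/M)\,\mathbf{p}_{2L}$, as well as
$p^2/2\mu = (m_1/M)\,T_{2L}$. This can be used to determine the parameter $\xi$:

$$\xi = \sqrt{1 + \frac{2\mu Q}{p^2}} = \sqrt{1 + \frac{M}{m_1}\,\frac{Q}{T_{2L}}} .$$

Since $\mathbf{p}_{1L} = \mathbf{0}$ and $\mathbf{p}_{2L} = (M/m_1)\,\mathbf{p}$, we now have

$$\mathbf{p}_{1L}' = (1 - \xi D)\,\mathbf{p} \quad\text{and}\quad \mathbf{p}_{2L}' = \Big(\frac{m_2}{m_1} + \xi D\Big)\,\mathbf{p} .$$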


The scattering normal $\mathbf{n} = \mathbf{p}\times\mathbf{p}'/|\mathbf{p}\times\mathbf{p}'|$ in Fig. 2.8 points into the viewing
direction, so the vector $\mathbf{n}\times\mathbf{p}$ points to the right. Therefore, in the center-of-mass
frame, as a function of the scattering angle ($\theta_S$), the rotated vector $D\mathbf{p}$ can be
expanded in terms of $\mathbf{p}$ and $\mathbf{n}\times\mathbf{p}$: $D\mathbf{p} = \cos\theta_S\,\mathbf{p} + \sin\theta_S\,\mathbf{n}\times\mathbf{p}$. We then obtain
for the recoil momentum $\mathbf{p}_{1L}'$ and for the momentum of the colliding particle $\mathbf{p}_{2L}'$,

$$\mathbf{p}_{1L}' = (1 - \xi\cos\theta_S)\,\mathbf{p} - \xi\sin\theta_S\,\mathbf{n}\times\mathbf{p},$$

$$\mathbf{p}_{2L}' = \Big(\frac{m_2}{m_1} + \xi\cos\theta_S\Big)\,\mathbf{p} + \xi\sin\theta_S\,\mathbf{n}\times\mathbf{p} .$$

Fig. 2.8 A mass (open circle) collides with a mass twice as heavy (closed circle) at rest. The
momenta before (left) and after (right) the collision are indicated in the laboratory frame (top) and
the centre-of-mass frame (bottom). In the latter, only the relative momenta $\mathbf{p}$ and $\mathbf{p}'$ before and
after are shown, as the total momentum $\mathbf{P}$ is conserved. Here an elastic collision was assumed,
whence the dashed lines around the full circle form a rhombus and are in the ratio $m_2 : m_1$. The
large rhombus angle is clearly equal to $\pi - \theta_S$ and twice as large as $\theta_{1L}$. For equal masses the two
objects fly off from each other at right angles


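Hence, noting that we should always have $0 \le \theta \le \pi$, and in addition that
$\tan\theta_{1L} = |\mathbf{p}_{1L}'\times\mathbf{p}|/(\mathbf{p}_{1L}'\cdot\mathbf{p})$ and $(\mathbf{n}\times\mathbf{p})\times\mathbf{p} = -p^2\,\mathbf{n}$, the scattering angle in the laboratory
frame of the scattering particle (the one that is impinged upon) is given by

$$\tan\theta_{1L} = \frac{\sin\theta_S}{\xi^{-1} - \cos\theta_S},$$

while for the scattered (impinging) particle,

$$\tan\theta_{2L} = \frac{\sin\theta_S}{\zeta + \cos\theta_S}, \quad\text{with } \zeta \equiv \frac{m_2}{\xi\,m_1} .$$

From this we can conclude that, for elastic collision ($\xi = 1$), as in Fig. 2.8,
$\theta_{1L} = \tfrac{1}{2}(\pi - \theta_S)$ and for $\zeta = 1$, $\theta_{2L} = \tfrac{1}{2}\theta_S$, so that for equal masses (and elastic collision)
$\theta_{1L} + \theta_{2L} = \tfrac{1}{2}\pi$.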

For a given $\theta_S$, there is a value $\theta_{1L}$ and a value of $\theta_{2L}$, and for $\xi \le 1$, $\theta_{1L} \le \tfrac{1}{2}\pi$.
In most cases, we do not consider the recoil, and instead of $\theta_{2L}$ we simply write $\theta_L$.
Since $\sin\theta_S\cos\theta_L = (\zeta + \cos\theta_S)\sin\theta_L$, the relation between $\theta_L$ and $\theta_S$ can thus be
written as

$$\zeta\sin\theta_L = \sin(\theta_S - \theta_L) \quad\Longrightarrow\quad \theta_S = \theta_L + \arcsin(\zeta\sin\theta_L) .$$

This relation is shown in Fig. 2.9 for different values of $\zeta$. For $\zeta > 1$, only values
$\theta_L \le \arcsin\zeta^{-1}$ occur, and for each $\theta_L$ below this bound, there are two values of $\theta_S$.
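The conversion between the two angles, including the two-valued inverse for $\zeta > 1$, can be sketched as follows. The function names `lab_angle` and `cm_angles` are assumptions for this illustration; the second branch $\theta_S = \theta_L + \pi - \arcsin(\zeta\sin\theta_L)$ is the other solution of $\zeta\sin\theta_L = \sin(\theta_S - \theta_L)$.

```python
import math

def lab_angle(theta_s, zeta):
    """theta_L from tan(theta_L) = sin(theta_S) / (zeta + cos(theta_S))."""
    return math.atan2(math.sin(theta_s), zeta + math.cos(theta_s))

def cm_angles(theta_l, zeta):
    """Center-of-mass angles compatible with theta_L (two values if zeta > 1)."""
    x = zeta * math.sin(theta_l)
    if x > 1.0:
        return []                        # theta_L lies beyond arcsin(1/zeta)
    first = theta_l + math.asin(x)
    if zeta > 1.0:
        return [first, theta_l + math.pi - math.asin(x)]
    return [first]

print(lab_angle(1.0, 1.0))               # half the center-of-mass angle for zeta = 1
print(cm_angles(lab_angle(2.5, 2.0), 2.0))
```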
For the moduli of the momenta, we find

$$p_{1L}' = p\,\sqrt{1 - 2\xi\cos\theta_S + \xi^2}$$

and

$$p_{2L}' = p\,\xi\,\sqrt{1 + 2\zeta\cos\theta_S + \zeta^2} .$$

The recoil momentum $p_{1L}'$ is equal to the momentum transfer in the laboratory or
center-of-mass frame ($|\mathbf{p}_{2L}' - \mathbf{p}_{2L}| = |\mathbf{p}_{2S}' - \mathbf{p}_{2S}|$). For an elastic collision, it is equal
to $2p\sin\tfrac{1}{2}\theta_S$. Since $\cos\theta_L = \mathbf{p}_{2L}'\cdot\mathbf{p}/(p_{2L}'\,p)$, we have

$$\cos\theta_L = \frac{\zeta + \cos\theta_S}{\sqrt{1 + 2\zeta\cos\theta_S + \zeta^2}} .$$

Fig. 2.9 Relation between the scattering angles in laboratory and center-of-mass coordinates
($\theta_S$ plotted against $\theta_L$, both from 0° to 180°) for $\zeta = 1/2$ and $3/4$ (dashed red), for $\zeta = 1$ (full
magenta), and for $\zeta = 4/3$ and $2$ (dotted blue)


Hence one obtains the ratio $d\Omega_L/d\Omega_S = |d\cos\theta_L/d\cos\theta_S|$, namely

$$\frac{d\Omega_L}{d\Omega_S} = \frac{|1 + \zeta\cos\theta_S|}{\sqrt{1 + 2\zeta\cos\theta_S + \zeta^2}^{\,3}},$$

whence the scattering cross-sections can be converted from the laboratory to the
center-of-mass frame (or vice versa).
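The solid-angle ratio is just the derivative of $\cos\theta_L$ with respect to $\cos\theta_S$, which can be confirmed by a finite difference. The function names below are assumptions for this sketch.

```python
import math

def cos_lab(cos_s, zeta):
    """cos(theta_L) as a function of cos(theta_S)."""
    return (zeta + cos_s) / math.sqrt(1.0 + 2.0 * zeta * cos_s + zeta**2)

def solid_angle_ratio(cos_s, zeta):
    """Closed form |1 + zeta cos(theta_S)| / (1 + 2 zeta cos(theta_S) + zeta^2)^(3/2)."""
    return abs(1.0 + zeta * cos_s) / (1.0 + 2.0 * zeta * cos_s + zeta**2) ** 1.5

def solid_angle_ratio_numeric(cos_s, zeta, h=1e-6):
    """|d cos(theta_L) / d cos(theta_S)| from a central finite difference."""
    return abs(cos_lab(cos_s + h, zeta) - cos_lab(cos_s - h, zeta)) / (2.0 * h)

print(solid_angle_ratio(0.3, 0.5), solid_angle_ratio_numeric(0.3, 0.5))
```

For $\zeta = 1$ in the forward direction the ratio is $1/4$, in line with $\theta_L = \tfrac{1}{2}\theta_S$.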

In principle the mass ratio can be determined from the velocities before and
after the collision, even if it is inelastic, whence $\xi = |\mathbf{v}_2' - \mathbf{v}_1'|/v_2 \ne 1$: for a central
collision ($\theta_S = \pi$) and since $v_2' : v_2 = (m_2 - \xi m_1) : (m_2 + m_1)$, we have

$$\frac{m_2}{m_1} = \frac{\xi\,v_2 + v_2'}{v_2 - v_2'} .$$

For all other collisions, the momenta perpendicular to the original one cancel each
other (momentum conservation): $m_2 : m_1 = v_{1\perp}' : v_{2\perp}'$. Further supplements can be
found in Problems 2.15-2.18.


2.2.4 Newton’s Second Law 


Newton’s law of motion is understood as his second axiom: 
Each force F on a freely mobile body changes its momentum according to 


$$\mathbf{F} = \dot{\mathbf{p}} .$$




The inertial law, referring to the case $\mathbf{F} = \mathbf{0}$, seems to be a special case. But this was
taken as defining an inertial system, because only then can the mass and momentum
be introduced as observables. Since $d\mathbf{p} = \mathbf{F}\,dt$, the force often appears in an integral
over $\mathbf{F}\,dt$, which is referred to as the impulse (impulsive force). In $\mathbf{p} = m\dot{\mathbf{r}}$, we can
often take the mass as constant, whereupon

$$\mathbf{F} = m\,\ddot{\mathbf{r}} .$$

In relativistic dynamics, the factor $\gamma = 1/\sqrt{1 - v^2/c^2}$ also enters our considerations,
because we must refer to the proper time, as will be shown in Sect. 3.4.10.

The equation $\mathbf{F} = \dot{\mathbf{p}}$ can be applied to rotational motion. Since $\dot{\mathbf{r}} \parallel \mathbf{p}$, it is clear
that $d(\mathbf{r}\times\mathbf{p})/dt = \mathbf{r}\times\dot{\mathbf{p}} = \mathbf{r}\times\mathbf{F} \equiv \mathbf{M}$, and since $\mathbf{r}\times\mathbf{p} = \mathbf{L}$, we conclude that

$$\mathbf{M} = \dot{\mathbf{L}} .$$


A torque on a mobile body changes its angular momentum. 
For an invariable mass, the law of motion delivers a differential equation of second 
order: 


$$\ddot{\mathbf{r}} = \frac{\mathbf{F}(t, \mathbf{r}, \dot{\mathbf{r}})}{m} .$$


This differential equation has to be integrated, because we are interested in the orbit, 
and from r (t) we can derive the velocity. Then, for each integration an integration 
constant occurs (here, actually an integration vector). The law of motion leaves 
us with the choice of initial position and velocity, so the general solution r of the 
differential equation depends upon $t$, $\mathbf{r}_0$, and $\dot{\mathbf{r}}_0$. These values have to determine the
solution uniquely, otherwise the force is unphysical. 

If the force does not depend upon the velocity, but only on position $\mathbf{r}$ and possibly
time $t$, we speak of a given force field. Since for a given force the acceleration $\ddot{\mathbf{r}}$ is
inversely proportional to the mass, we consider the field $\mathbf{F}/m$, and for curl-free force
fields, the potential $\Phi \equiv V/m$ instead of the potential energy $V$. Only then is the
force field independent of the test body, and we have

$$\ddot{\mathbf{r}} = -\nabla\Phi, \quad\text{with } \Phi \equiv \frac{V}{m},$$

if $\nabla\times\mathbf{F} = \mathbf{0}$ and $\dot{m} = 0$.


2.2.5 Conserved Quantities and Time Averages 


If a force acts, $\mathbf{F} \ne \mathbf{0}$, then the momentum is no longer a conserved quantity because it
changes with time. But let us consider also the two conserved quantities encountered
so far, the kinetic energy and the angular momentum: what are their derivatives with
respect to time when a force acts? If we assume a constant mass, we obtain

$$\frac{dT}{dt} = \frac{1}{2m}\,\frac{d\,(\mathbf{p}\cdot\mathbf{p})}{dt} = \frac{\mathbf{p}}{m}\cdot\frac{d\mathbf{p}}{dt} = \frac{d\mathbf{r}}{dt}\cdot\mathbf{F} .$$

For a time-independent force, we thus find $dT = \mathbf{F}\cdot d\mathbf{r}$. If moreover the force field
is curl-free, then it can be derived from a potential energy $V$, and because
$dV = \nabla V\cdot d\mathbf{r} = -\mathbf{F}\cdot d\mathbf{r}$, we clearly have $dT = -dV$. (If the force depends upon time,
then neither $dT = \mathbf{F}\cdot d\mathbf{r}$ nor $\oint\mathbf{F}\cdot d\mathbf{r} = 0$ can be inferred, and it then depends on
the time span over which the work is done.) Thus, there is a conservation law for the
energy, viz.,


E=T+V, 


if V (or the associated force F ) does not depend on time. 

In the following sections, we shall discuss several examples with curl-free forces, 
to which a potential can therefore be assigned. An important counter-example is 
provided by the Lorentz force 


$$\mathbf{F} = q\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}),$$


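which acts on an electric charge $q$ in an electromagnetic field specified by $\mathbf{E}$ and $\mathbf{B}$.
The Maxwell equations $\nabla\times\mathbf{E} = -\partial\mathbf{B}/\partial t$ and $\nabla\cdot\mathbf{B} = 0$ imply that $\mathbf{F}$ has the curl
density

$$\nabla\times\mathbf{F} = -q\,\Big(\frac{\partial\mathbf{B}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{B}\Big) = -q\,\dot{\mathbf{B}},$$

since $\mathbf{r}$ and $\mathbf{v}$ have to be treated as mutually independent variables, whereupon
$\nabla\times(\mathbf{v}\times\mathbf{B}) = \mathbf{v}\,\nabla\cdot\mathbf{B} - (\mathbf{v}\cdot\nabla)\,\mathbf{B}$. Even if the magnetic field does not depend on
position (only on time), the force field has curls. Then there is no potential energy,
unless we introduce a generalized potential energy as in Sect. 2.3.4. In any case, here
(with $\mathbf{E} = \mathbf{0}$ and constant mass) the equation of motion is

$$\dot{\mathbf{v}} = \boldsymbol{\omega}\times\mathbf{v}, \quad\text{with } \boldsymbol{\omega} \equiv -\frac{q}{m}\,\mathbf{B} .$$

The value of $\omega$ is the cyclotron frequency. (Note that we encountered a similar
differential equation for the circular orbit, but for $\mathbf{r}$ rather than $\mathbf{v}$, on p. 67.) Even
though a force acts here, the kinetic energy is still conserved because the Lorentz
force is always perpendicular to $\mathbf{v}$ and thus does not change $v$. Therefore, if we
set $\mathbf{v} = v\,\mathbf{e}_T$ with fixed $v$, it follows that $\dot{\mathbf{e}}_T = \boldsymbol{\omega}\times\mathbf{e}_T$, or $d\mathbf{e}_T/ds = v^{-1}\,\boldsymbol{\omega}\times\mathbf{e}_T$. We
have already met this differential equation on p. 8. Quite generally, the charge moves
on a helical orbit in the homogeneous magnetic field, with fixed Darboux vector
$\tau\,\mathbf{e}_T + \kappa\,\mathbf{e}_B = \boldsymbol{\omega}/v$.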




The other conserved quantity introduced so far, the angular momentum L = r x p, 
is only a conserved quantity if the torque M vanishes, i.e., if F is a central force, 
since dL/dt = M according to the last section. This is the case, e.g., if the potential 
has spherical symmetry: 


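$$\Phi(\mathbf{r}) = \Phi(r) \quad\Longrightarrow\quad \nabla\Phi = \frac{d\Phi}{dr}\,\frac{\mathbf{r}}{r} \quad\Longrightarrow\quad \frac{d\mathbf{L}}{dt} = \mathbf{0}.$$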


Note that here only the angular momentum with respect to the symmetry center is
conserved. It is only if no force acts at all that it is conserved with respect to any
point. For cylindrical symmetry, thus if $\Phi$ does not depend upon the angle coordinate
$\varphi$, at least the angular momentum component along the symmetry axis is conserved.

Of course, mean values taken over time are also conserved. This is important for
the virial theorem, which says that, if $\mathbf{r}$ and $\mathbf{p}$ always stay finite (and the mass always
the same), then for the time-averaged value of the virial $\mathbf{r}\cdot\mathbf{F}$,

$$\overline{\mathbf{r}\cdot\mathbf{F}} = -2\,\overline{T}.$$

Indeed, if $\mathbf{r}$ and $\mathbf{p}$ always stay finite, then so does the auxiliary quantity $G(t) = \mathbf{r}\cdot\mathbf{p}$.
For sufficiently long times $t$, the quantity $t^{-1}\{G(t) - G(0)\}$ thus vanishes, and this
is the mean value of $\dot{G} = \mathbf{v}\cdot\mathbf{p} + \mathbf{r}\cdot\mathbf{F} = 2T + \mathbf{r}\cdot\mathbf{F}$ between 0 and $t$. For example,
for a central force $\mathbf{F} = c\,r^n\,\mathbf{r}/r$, the virial is equal to $c\,r^{n+1}$, and hence, according
to p. 57, it is equal to $-(n+1)\,V$. This theorem leads here to $\overline{T} = \tfrac12(n+1)\,\overline{V}$. In
particular, for a harmonic oscillation, we have $n = 1$ and thus $\overline{T} = \overline{V}$, while for the
gravitational and Coulomb forces $n = -2$, and thus $\overline{T} = -\tfrac12\,\overline{V}$. The virial theorem
must not be applied to the hyperbolic orbit, because $\mathbf{r}\cdot\mathbf{p}$ does not remain finite.
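The harmonic case of the virial theorem can be checked numerically. The following is a minimal sketch, assuming a one-dimensional oscillator with force $F = -kx$ and hypothetical values for `m`, `k` and `amplitude` (none of them from the text); the averages are taken over one full period with a midpoint rule.

```python
import math

def time_average(f, period, steps=10_000):
    """Midpoint-rule average of f(t) over one full period."""
    dt = period / steps
    return sum(f((k + 0.5) * dt) for k in range(steps)) / steps

# Hypothetical parameters for a 1D harmonic oscillator with F = -k x (n = 1).
m, k, amplitude = 2.0, 8.0, 0.5
omega = math.sqrt(k / m)
period = 2.0 * math.pi / omega

def x(t):
    return amplitude * math.cos(omega * t)

def v(t):
    return -amplitude * omega * math.sin(omega * t)

mean_T = time_average(lambda t: 0.5 * m * v(t) ** 2, period)       # kinetic energy
mean_V = time_average(lambda t: 0.5 * k * x(t) ** 2, period)       # potential energy
mean_virial = time_average(lambda t: x(t) * (-k * x(t)), period)   # r . F

# The virial theorem predicts mean_virial = -2 mean_T and, for n = 1, mean_T = mean_V.
print(mean_T, mean_V, mean_virial)
```

For this bounded motion the averaged virial should agree with $-2\overline{T}$ up to rounding errors; an unbounded orbit would not satisfy the assumption that $\mathbf{r}\cdot\mathbf{p}$ stays finite.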


2.2.6 Planetary Motion as a Two-Body Problem, 
and Gravitational Force 


If there are no external forces, the total momentum is conserved. Then we are con- 
cerned only with the relative motion. An important example is application to the 
Sun-Earth system, which may be viewed approximately as a two-body problem, 
although the Moon and the other planets should be accounted for in a more accurate 
solution. 

Here gravity acts, that is, the force between massive bodies. So far, the term “mass” 
has always been understood as inertial mass. But in fact, the active gravitational mass
$m_1$ exerts a force

$$\mathbf{F}_{21} = -G\,\frac{m_1 m_2}{|\mathbf{r}_2 - \mathbf{r}_1|^2}\,\frac{\mathbf{r}_2 - \mathbf{r}_1}{|\mathbf{r}_2 - \mathbf{r}_1|}$$

on the passive gravitational mass $m_2$, where $G$ is the gravitational constant (see
p. 623). But from experience, we may assume the active and passive gravitational
masses and the inertial mass to be the same (at least to an accuracy of one part in
$10^{11}$); this is the basis of general relativity theory.

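Exactly as the Sun (S) attracts the Earth (E) and hence exerts the force $\mathbf{F}_{ES}$, the
Earth attracts the Sun with the opposite force $\mathbf{F}_{SE}$ according to Newton’s third law.
Hence we infer

$$\dot{\mathbf{p}}_E = \mathbf{F}_{ES} = -\mathbf{F}_{SE} = -\dot{\mathbf{p}}_S,$$

and $\dot{\mathbf{P}} = \dot{\mathbf{p}}_S + \dot{\mathbf{p}}_E = \mathbf{0}$ for the center of mass. Then, according to p. 72,

$$\dot{\mathbf{p}} = \frac{m_S\,\dot{\mathbf{p}}_E - m_E\,\dot{\mathbf{p}}_S}{m_S + m_E} = \dot{\mathbf{p}}_E = \mathbf{F}_{ES} = -G\,\frac{m_E\,m_S}{r^2}\,\frac{\mathbf{r}}{r}$$

for the relative momentum.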

Once again, only the relative coordinate is of interest: no external force acts
on the center of mass, as long as the influence of other celestial objects remains
negligible. Since $\mathbf{p} = \mu\,\dot{\mathbf{r}}$ with $\mu = m_S\,m_E/(m_S + m_E)$,

$$\ddot{\mathbf{r}} = -G\,\frac{m_S + m_E}{r^2}\,\frac{\mathbf{r}}{r}.$$

Therefore, the first two Kepler laws are valid not only with the Sun at the coordinate
origin, but also for the relative motion. With the third law, however, we have

$$\frac{a^3}{T^2} = G\,\frac{m_S + m_E}{4\pi^2},$$

i.e., for every planet there is another “constant”, since $a^3/T^2 = C/4\pi^2$ holds with
$\ddot{\mathbf{r}} = -C\,\mathbf{r}/r^3$, according to p. 65. However, the mass ratio of planet to Sun is less than
0.001 even for Jupiter. In addition, we have neglected the mutual attractions of the
other planets and moons. This perturbation can be accounted for approximately. This 
is how, from the perturbed orbit of Uranus, Leverrier deduced the presence of the as 
yet unknown planet Neptune, a jewel of celestial mechanics. Incidentally, Kepler had 
already noticed that Jupiter and Saturn did not travel on purely elliptical orbits—these 
two neighboring planets are the heaviest in the Solar System and therefore perturb 
each other with a particularly strong force. Likewise, returning comets move on 
elliptical orbits about the Sun which are sensitive to perturbations (see Problem 2.11). 

The gravitational force acts “not only in heaven, but also on Earth”. All objects 
are pulled toward the Earth—they have weight. However, this notion is used with 
different meanings. In the international system (SI), the (gravitational) mass is understood,
but in any everyday context, the associated gravitational force. If we buy 1 kg
of flour, we actually want to have the associated mass, but when we weigh it, we 
use the force with which the Earth attracts this mass. Physicists should stick to the 
international system and also take “weight” as mass. 




2.2.7 Gravitational Acceleration 


According to the gravitational law, at its surface, the Earth exerts the gravitational 
force 


$$\mathbf{F} = m\,\mathbf{g} \quad\text{with}\quad \mathbf{g} = -\frac{G\,m_E}{R^2}\,\frac{\mathbf{R}}{R}$$

on a body of mass $m$, if we assume a spherically symmetric Earth. Here the gravitational
acceleration $\mathbf{g}$ is assumed constant, as long as the distance $R$ from the center
of the Earth changes only negligibly (since the Earth rotates about its axis, we should
also take into account the position-dependent centrifugal force). The vector $-\mathbf{R}/R$
is a unit vector which, at the surface of the Earth, points “vertically downwards”.
The gravitational acceleration $g$ thus follows from the mass $m_E$ and radius $R$ of the
Earth and the gravitational constant $G$.

According to this equation, the total mass $m_E$ can be considered as concentrated
at the center of the Earth when evaluating the gravitational force on a test body near
the surface of the Earth. For the proof, since a scalar field is easier to work with than
the associated force field, we consider the gravitational potential

$$\Phi(\mathbf{r}) = -G\,\frac{m_E}{r} \quad\Longrightarrow\quad \mathbf{g} = -\nabla\Phi = -G\,\frac{m_E}{r^2}\,\frac{\mathbf{r}}{r},$$

which we derive for $r > R$ from

$$\Phi(\mathbf{r}) = -G \int \frac{\rho(\mathbf{r}')\,dV'}{|\mathbf{r} - \mathbf{r}'|}.$$

Here we assume the density distribution to be spherically symmetric, i.e., $\rho(\mathbf{r}') =
\rho(r')$, although it does not need to be homogeneous (actually, the Earth’s mantle has
a lower density than the core). Thus let

$$m_E = \int \rho(\mathbf{r}')\,dV' = 4\pi \int \rho(r')\,r'^2\,dr'.$$

In order to evaluate the potential, we expand $|\mathbf{r} - \mathbf{r}'|^{-1}$ in powers of $s = r'/r < 1$
and introduce the angle $\theta$ between $\mathbf{r}'$ and $\mathbf{r}$:

$$\frac{1}{|\mathbf{r} - \mathbf{r}'|} = \frac{1}{r\,\sqrt{1 - 2s\cos\theta + s^2}} = \frac{1}{r} \sum_{n=0}^{\infty} P_n(\cos\theta)\,s^n.$$

The expansion coefficients $P_n(\cos\theta)$ are called Legendre polynomials. We shall meet
them occasionally, e.g., in electrostatics (see p. 181) and in quantum theory with the
spherical functions (see p. 334 ff). The first of these are (see Fig. 2.10)


$$P_0(z) = 1, \qquad P_1(z) = z, \qquad P_2(z) = \tfrac12\,(3z^2 - 1).$$

The remaining ones can be obtained via the recursion relation

$$(n+1)\,P_{n+1}(z) - (2n+1)\,z\,P_n(z) + n\,P_{n-1}(z) = 0,$$

which follows from the generating function (see the power series above)

$$\frac{1}{\sqrt{1 - 2sz + s^2}} = \sum_{n=0}^{\infty} P_n(z)\,s^n, \quad\text{for } |s| < 1.$$

This means that, if we differentiate this equation with respect to $s$ and then multiply it
by $1 - 2sz + s^2$, we obtain $(z - s) \sum_n P_n(z)\,s^n = (1 - 2sz + s^2) \sum_n n\,P_n(z)\,s^{n-1}$,
and hence by comparing coefficients, the recursion relation is proven.

Fig. 2.10 Legendre polynomials $P_n(z)$ with $n$ from 0 to 5. Continuous curves: Even $n$.
Dashed curves: Odd $n$. It can be proven recursively that $P_n(1) = 1$ and
$P_n(-z) = (-1)^n P_n(z)$ for all $n \in \{0, 1, 2, \ldots\}$ and that $n$ gives the number of zeros of
the given function

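In addition, the Legendre polynomials have the property (important for us)

$$\int_{-1}^{1} P_n(z)\,P_{n'}(z)\,dz = \int_0^{\pi} P_n(\cos\theta)\,P_{n'}(\cos\theta)\,\sin\theta\,d\theta = \frac{2}{2n+1}\,\delta_{nn'},$$

whence they form a (complete) orthogonal system for $-1 \le z \le 1$. This can be
proven using the generating function of the Legendre polynomials. For $|s| < 1$ and
$|t| < 1$, this delivers

$$\frac{1}{\sqrt{1 - 2sz + s^2}\,\sqrt{1 - 2tz + t^2}} = \sum_{mn} P_m(z)\,P_n(z)\,s^m t^n.$$

But now, if we cancel a factor of $\sqrt{2t} + \sqrt{2s}$,

$$\int_{-1}^{1} \frac{dz}{\sqrt{1 - 2sz + s^2}\,\sqrt{1 - 2tz + t^2}}
= -\frac{1}{\sqrt{st}}\,\ln\Big(\sqrt{2t\,(1 - 2sz + s^2)} + \sqrt{2s\,(1 - 2tz + t^2)}\Big)\Big|_{-1}^{+1}$$
$$= \frac{1}{\sqrt{st}}\,\ln\frac{\sqrt{t}\,(1+s) + \sqrt{s}\,(1+t)}{\sqrt{t}\,(1-s) + \sqrt{s}\,(1-t)}
= \frac{1}{\sqrt{st}}\,\ln\frac{1 + \sqrt{st}}{1 - \sqrt{st}}.$$

Hence, since $\ln(1 + x) = -\sum_{n=1}^{\infty} (-x)^n/n$ for $|x| < 1$, it follows that

$$\int_{-1}^{+1} \frac{dz}{\sqrt{1 - 2sz + s^2}\,\sqrt{1 - 2tz + t^2}} = \sum_{n=0}^{\infty} \frac{2}{2n+1}\,(st)^n, \quad\text{for } |st| < 1.$$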

Comparing coefficients proves the claim. 

Further properties of the Legendre polynomials are given on p. 334, and more can 
be found in, e.g., [1]. 
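The recursion relation, the generating function, and the orthogonality property can be checked numerically. The sketch below is only an illustration: the function names `legendre`, `overlap` and `generating_sum` are hypothetical, and the integral is approximated by a simple midpoint rule rather than an exact quadrature.

```python
import math

def legendre(n_max, z):
    """Return [P_0(z), ..., P_n_max(z)] from the three-term recursion."""
    values = [1.0, z]
    for n in range(1, n_max):
        # (n+1) P_{n+1} = (2n+1) z P_n - n P_{n-1}
        values.append(((2 * n + 1) * z * values[n] - n * values[n - 1]) / (n + 1))
    return values[: n_max + 1]

def overlap(n, m, steps=20_000):
    """Midpoint-rule estimate of the integral of P_n(z) P_m(z) over [-1, 1]."""
    dz = 2.0 / steps
    total = 0.0
    for k in range(steps):
        p = legendre(max(n, m), -1.0 + (k + 0.5) * dz)
        total += p[n] * p[m] * dz
    return total

def generating_sum(z, s, n_max=60):
    """Partial sum of P_n(z) s^n, to be compared with 1/sqrt(1 - 2 s z + s^2)."""
    return sum(p * s ** n for n, p in enumerate(legendre(n_max, z)))

print(overlap(2, 2), 2.0 / (2 * 2 + 1))   # diagonal element versus 2/(2n+1)
print(overlap(1, 3))                      # off-diagonal element, should be close to zero
print(generating_sum(0.3, 0.5), 1.0 / math.sqrt(1.0 - 2.0 * 0.5 * 0.3 + 0.5 ** 2))
```

The forward recursion is used here because it needs only the two starting values $P_0$ and $P_1$ given in the text.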

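Since we started from a spherically symmetric density distribution $\rho(\mathbf{r}') =
\rho(r')\,P_0$, after integrating over all directions, only the term with $n = 0$ remains:

$$\int \frac{\rho(\mathbf{r}')\,dV'}{|\mathbf{r} - \mathbf{r}'|} = \frac{4\pi}{r} \int \rho(r')\,r'^2\,dr' = \frac{m_E}{r}, \quad\text{for } r > R.$$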


This means that we can perform calculations as though the mass of the Earth were 
concentrated at its center. (Problem 2.20 is also instructive here.) 
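As a quick numerical illustration of this result, one can average $1/|\mathbf{r} - \mathbf{r}'|$ over a thin spherical shell and compare with $1/r$. The sketch below assumes a field point outside the shell; the function name and the sample radii are hypothetical.

```python
import math

def shell_average_inverse_distance(r, r_shell, steps=20_000):
    """Average of 1/|r - r'| over a shell of radius r_shell (midpoint rule in cos(theta))."""
    dz = 2.0 / steps
    total = 0.0
    for k in range(steps):
        z = -1.0 + (k + 0.5) * dz  # z = cos(theta)
        total += dz / math.sqrt(r * r - 2.0 * r * r_shell * z + r_shell * r_shell)
    return 0.5 * total  # divide by the integration range 2 to obtain the average

r, r_shell = 2.0, 1.0  # field point outside the shell
print(shell_average_inverse_distance(r, r_shell), 1.0 / r)
```

Since every shell contributes as if its mass sat at the center, the same holds for any spherically symmetric superposition of shells inside $r$.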


2.2.8 Free-Fall, Thrust, and Atmospheric Drag 


If we calculate with the same gravitational acceleration $\mathbf{g}$ everywhere on the surface
of the Earth, then, according to Newton’s law of motion, we obtain

$$\ddot{\mathbf{r}} = \mathbf{g}, \qquad \dot{\mathbf{r}} = \mathbf{v}_0 + \mathbf{g}\,t, \qquad \mathbf{r} = \mathbf{r}_0 + \mathbf{v}_0\,t + \tfrac12\,\mathbf{g}\,t^2.$$

According to pp. 57 and 77, a gravitational potential

$$\Phi(\mathbf{r}) = -\mathbf{g}\cdot\mathbf{r}$$

is associated with the constant acceleration. The gauge here is such that the potential
vanishes at the surface of the Earth, where the coordinate origin is taken. If a body
loses height $h$, its potential energy decreases by $mgh$. For free fall, the kinetic energy
increases by this amount, so its velocity goes from zero to $v = \sqrt{2gh}$.

If the body is thrown through air instead of empty space, then it loses momentum
to the air molecules it collides with. The number of collisions per unit time increases
linearly with its velocity, and in each collision, it loses on average a fraction of
its momentum determined by the mass ratio. Hence we have to set the frictional
force proportional to $-v\,\mathbf{v}$ (Newtonian friction, not Stokes friction, which would be
proportional to $\mathbf{v}$, as, e.g., later on p. 99) and write with $\beta > 0$,

$$\dot{\mathbf{v}} = \mathbf{g} - \beta^2 g\,v\,\mathbf{v}.$$

For objects surrounded by fluids one normally writes $c_w\,\tfrac12\,\rho A\,v^2$ for the frictional
force, where $c_w$ is the drag coefficient, $\rho$ the density of the medium (here the air),
and $A$ the cross-section of the body perpendicular to the air stream. Streamlined
bodies have the smallest drag coefficient, namely 0.055. As far as the author is
aware, the above non-linear differential equation can be solved in closed form only
in one dimension. Therefore we assume that $\mathbf{v}_0$ is parallel to the vertical and measure
$v$ in the direction of $\mathbf{g}$. Then we have

$$\dot{v} = g\,(1 - \beta^2 v^2) \quad\Longrightarrow\quad \frac{dv}{1 - \beta^2 v^2} = g\,dt.$$

After separation of variables, we can integrate and obtain

$$v = \frac{1}{\beta}\,\frac{\beta v_0 + \tanh(\beta g t)}{1 + \beta v_0 \tanh(\beta g t)}.$$

Consequently, the velocity changes at first linearly with time, $v \approx v_0 + (1 - \beta^2 v_0^2)\,g t$,
and finally becomes constant (incidentally faster than the horizontal component
of $\mathbf{v}$ which tends to zero):

$$v \approx \frac{1}{\beta}\,\Big(1 - 2\,\frac{1 - \beta v_0}{1 + \beta v_0}\,\exp(-2\beta g t)\Big).$$

For $x \gg 1$, i.e., $\tanh x \approx 1 - 2\exp(-2x)$, and with $b = \beta v_0$ and $e = 2\exp(-2x)$, we
have

$$\frac{b + 1 - e}{1 + b\,(1 - e)} = \frac{1 - e/(1 + b)}{1 - be/(1 + b)},$$

and because $|e| \ll 1$, we may replace $\{1 - be/(1 + b)\}^{-1}$ approximately by $1 +
be/(1 + b)$ and likewise $\{1 - e/(1 + b)\}\{1 + be/(1 + b)\}$ by $1 - e\,(1 - b)/(1 + b)$.
The body is accelerated until its gravitational force and frictional force cancel 
each other. It then permanently loses potential energy without increasing its kinetic 
energy—the energy has now completely turned into frictional energy (heat). 

Note that a solution initially changing linearly in time and ending up exponentially
approaching a constant velocity also occurs for free fall with Stokes’s friction, viz.,
$\dot{\mathbf{v}} = \mathbf{g} - \alpha\,\mathbf{v}$. Then we have $\mathbf{v} = \alpha^{-1}\mathbf{g} + (\mathbf{v}_0 - \alpha^{-1}\mathbf{g})\,\exp(-\alpha t)$, where $\mathbf{v}_0$ and $\mathbf{g}$
may span a plane. As can be seen from Fig. 2.11, it is better in any case to calculate
with this approximation than to neglect friction completely.
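The closed-form velocity for Newtonian friction can be compared with a direct numerical integration of $\dot{v} = g\,(1 - \beta^2 v^2)$. The following sketch assumes motion along $\mathbf{g}$ with $v \ge 0$; the value `g = 9.81` and the sample values of `beta` and `v0` are illustrative assumptions, not data from the text.

```python
import math

def v_closed_form(t, v0, beta, g=9.81):
    """Closed-form solution of dv/dt = g (1 - beta^2 v^2), with v measured along g."""
    th = math.tanh(beta * g * t)
    return (beta * v0 + th) / (beta * (1.0 + beta * v0 * th))

def v_rk4(t_end, v0, beta, g=9.81, steps=10_000):
    """Integrate the same equation with a classical fourth-order Runge-Kutta scheme."""
    def f(v):
        return g * (1.0 - (beta * v) ** 2)
    h, v = t_end / steps, v0
    for _ in range(steps):
        k1 = f(v)
        k2 = f(v + 0.5 * h * k1)
        k3 = f(v + 0.5 * h * k2)
        k4 = f(v + h * k3)
        v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v

beta, v0 = 0.02, 5.0  # hypothetical values: terminal speed 1/beta = 50 (in units of m/s)
for t in (0.5, 5.0, 50.0):
    print(t, v_closed_form(t, v0, beta), v_rk4(t, v0, beta))
```

For large times both results approach the terminal speed $1/\beta$, at which the gravitational and frictional forces cancel.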


Fig. 2.11 Free fall from rest with friction with the air. Continuous red curves: Newtonian friction.
Dotted blue curves: Stokes’s friction (here, $\alpha = \beta g$). Dashed green curves: Without friction (with
appropriate scaling $v = gt$ and $s = \tfrac12\,g t^2$)


2.2.9 Rigid Bodies 


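The parts of a rigid body always keep their relative distances. Therefore, we shall
refer the position vectors of the mass elements $dm$ to a fixed point in the body, usually
the center of mass $\mathbf{R} = \int \mathbf{r}\,dm / M$:

$$\mathbf{r}' = \mathbf{r} - \mathbf{R} \quad\Longrightarrow\quad \int \mathbf{r}'\,dm = \mathbf{0}.$$

The vectors $\mathbf{r}'$ have constant lengths, so we can infer the equations

$$\frac{d(\mathbf{r}'\cdot\mathbf{r}')}{dt} = 0 \quad\Longrightarrow\quad \frac{d\mathbf{r}'}{dt} = \boldsymbol{\omega}\times\mathbf{r}'.$$

Here $\boldsymbol{\omega}$ is an axial vector in the direction of the axis of rotation (right-hand rule)
and with the value of the angular velocity, as already introduced on p. 67 for circular
motion. It describes the rotation of the rigid body and does not depend on the position
$\mathbf{r}'$. For all $i$ and $k$, $\mathbf{r}'_i\cdot\mathbf{r}'_k$ will not depend on time, whence $\dot{\mathbf{r}}'_i\cdot\mathbf{r}'_k + \mathbf{r}'_i\cdot\dot{\mathbf{r}}'_k = (\boldsymbol{\omega}_i -
\boldsymbol{\omega}_k)\cdot(\mathbf{r}'_i\times\mathbf{r}'_k)$ must always vanish, and $\boldsymbol{\omega}_i = \boldsymbol{\omega}_k$ must hold.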

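From these considerations, we may generally decompose the motion of each point
of the body into that of the reference point and a rotational motion:

$$\dot{\mathbf{r}} = \mathbf{V} + \boldsymbol{\omega}\times\mathbf{r}'.$$

For the total momentum,

$$\mathbf{P} = \int \dot{\mathbf{r}}\,dm = M\,\mathbf{V} + \boldsymbol{\omega}\times\int \mathbf{r}'\,dm,$$

where we may write $\rho(\mathbf{r}')\,dV'$ instead of $dm$. The last term vanishes because
$\int \mathbf{r}'\,dm = \mathbf{0}$. The expressions for the angular momentum and the kinetic energy
(see p. 72) then simplify to (otherwise there would be further terms):

$$\mathbf{L} = \int \mathbf{r}\times\dot{\mathbf{r}}\,dm = \mathbf{R}\times\mathbf{P} + \int \mathbf{r}'\times(\boldsymbol{\omega}\times\mathbf{r}')\,dm,$$
$$T = \tfrac12 \int \dot{\mathbf{r}}\cdot\dot{\mathbf{r}}\,dm = \tfrac12\,M V^2 + \tfrac12\,I_\omega\,\omega^2.$$

Here $I_\omega$ is the moment of inertia of the body with respect to the axis $\mathbf{e}_\omega = \boldsymbol{\omega}/\omega$,
which must go through the center of mass:

$$I_\omega = \int (\mathbf{e}_\omega\times\mathbf{r}')^2\,dm = \int \{r'^2 - (\mathbf{e}_\omega\cdot\mathbf{r}')^2\}\,\rho(\mathbf{r}')\,dV'.$$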


More precisely, we should write $I_{\omega,\mathrm{CM}}$, because the axis of rotation goes through
the center of mass. For a rotation about the origin, $\mathbf{e}_\omega\times\mathbf{r}'$ is to be replaced by
$\mathbf{e}_\omega\times(\mathbf{R} + \mathbf{r}')$, and therefore both moments of inertia differ by the non-negative
quantity

$$I_\omega - I_{\omega,\mathrm{CM}} = M\,(\mathbf{e}_\omega\times\mathbf{R})^2,$$

i.e., by the mass multiplied by the square of the distance of the center of mass from
the axis of rotation. This is Steiner’s theorem. It is very helpful, because we may then
choose the origin of our coordinate systems in a more convenient place.
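Steiner’s theorem is easy to verify for a handful of point masses. The sketch below uses hypothetical masses, positions and axis direction, and compares the difference of the two moments of inertia with $M\,(\mathbf{e}_\omega\times\mathbf{R})^2$.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def moment_about_axis(masses, points, axis, origin=(0.0, 0.0, 0.0)):
    """Sum of m |e x (r - origin)|^2 for an axis with unit vector e through `origin`."""
    total = 0.0
    for m, r in zip(masses, points):
        d = cross(axis, tuple(ri - oi for ri, oi in zip(r, origin)))
        total += m * sum(c * c for c in d)
    return total

# Hypothetical rigid arrangement of three point masses.
masses = [1.0, 2.0, 3.0]
points = [(1.0, 0.0, 0.5), (0.0, 2.0, -1.0), (-1.0, 1.0, 3.0)]
axis = (0.0, 0.6, 0.8)  # unit vector e_omega

total_mass = sum(masses)
center = tuple(sum(m * r[i] for m, r in zip(masses, points)) / total_mass
               for i in range(3))

I_origin = moment_about_axis(masses, points, axis)
I_cm = moment_about_axis(masses, points, axis, origin=center)
steiner_term = total_mass * sum(c * c for c in cross(axis, center))

print(I_origin - I_cm, steiner_term)  # both numbers should agree
```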


2.2.10 Moment of Inertia 


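In general, the moment of inertia $I_\omega$ also depends upon the rotational orientation
$\mathbf{e}_\omega$. This is what we shall investigate now. Here we let the center of mass remain
at rest and thus take it as the reference point of the fixed body system. We shall
write $\mathbf{r}$ instead of the $\mathbf{r}'$ used so far. Then, because $\dot{\mathbf{r}} = \boldsymbol{\omega}\times\mathbf{r}$, we obtain the
expression

$$\mathbf{L} = \int \mathbf{r}\times(\boldsymbol{\omega}\times\mathbf{r})\,dm$$

for the angular momentum of the rigid body, which is also important for the kinetic
energy of the rotation (the rotational energy), since $(\boldsymbol{\omega}\times\mathbf{r})^2 = (\boldsymbol{\omega}\times\mathbf{r})\cdot(\boldsymbol{\omega}\times\mathbf{r}) =
\boldsymbol{\omega}\cdot\{\mathbf{r}\times(\boldsymbol{\omega}\times\mathbf{r})\}$ delivers

$$T_{\mathrm{rot}} = \tfrac12\,\boldsymbol{\omega}\cdot\mathbf{L},$$

corresponding to $T = \tfrac12\,\mathbf{v}\cdot\mathbf{p}$ for rectilinear motion. Clearly, $\mathbf{L}$ and $\boldsymbol{\omega}$ depend on each
other linearly, but may have different directions. If we write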


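$$\mathbf{L} = I\,\boldsymbol{\omega},$$

then $I$ is a linear operator, more precisely a tensor of second rank, because it assigns
a vector linearly to another vector. If we decompose

$$\mathbf{L} = \int \mathbf{r}\times(\boldsymbol{\omega}\times\mathbf{r})\,dm = \int \{r^2\,\boldsymbol{\omega} - \mathbf{r}\,(\mathbf{r}\cdot\boldsymbol{\omega})\}\,dm$$

in terms of Cartesian components, e.g., $L_x = \int \{\omega_x r^2 - x\,(x\omega_x + y\omega_y + z\omega_z)\}\,dm$,
we arrive at the system of linear equations

$$\begin{pmatrix} L_x \\ L_y \\ L_z \end{pmatrix} =
\begin{pmatrix} I_{xx} & I_{xy} & I_{xz} \\ I_{yx} & I_{yy} & I_{yz} \\ I_{zx} & I_{zy} & I_{zz} \end{pmatrix}
\begin{pmatrix} \omega_x \\ \omega_y \\ \omega_z \end{pmatrix}$$

with

$$I_{xx} = \int (r^2 - x^2)\,dm = \int (y^2 + z^2)\,\rho(\mathbf{r})\,dV,$$
$$I_{xy} = \int (-xy)\,dm = \int (-xy)\,\rho(\mathbf{r})\,dV = I_{yx},$$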


and cyclic permutations thereof. The 3 x 3 matrix is symmetric and has thus only 
six (real) independent elements. The three on the diagonal are called the moments 
of inertia, the remaining ones (without minus signs) the deviation moments, i.e., 
deviation of the direction of L from the direction of w. 

In the next section it will turn out that, for a suitable choice of axes, all the 
deviation moments vanish. In addition to the three principal moments of inertia on 
the diagonal, three further parameters are then required to fix the orientation of the 
axes, e.g., the Euler angles. This transition to diagonal form is called the principal 
axis transformation. 
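The relation $\mathbf{L} = I\,\boldsymbol{\omega}$ can be illustrated with point masses instead of a continuous density. In the sketch below, the masses, positions (measured from the fixed reference point) and the rotation vector are hypothetical; the tensor components follow the formulas for $I_{xx}$ and $I_{xy}$ above.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inertia_tensor(masses, points):
    """I_ij = sum of m (r^2 delta_ij - r_i r_j) over all point masses."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, points):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                tensor[i][j] += m * ((r2 if i == j else 0.0) - r[i] * r[j])
    return tensor

def angular_momentum_direct(masses, points, omega):
    """L = sum of m r x (omega x r), evaluated without the tensor."""
    total = [0.0, 0.0, 0.0]
    for m, r in zip(masses, points):
        term = cross(r, cross(omega, r))
        for i in range(3):
            total[i] += m * term[i]
    return total

masses = [1.0, 2.0, 3.0]
points = [(1.0, 0.0, 0.5), (0.0, 2.0, -1.0), (-1.0, 1.0, 3.0)]
omega = (0.2, -0.4, 1.0)

I = inertia_tensor(masses, points)
L_tensor = [sum(I[i][j] * omega[j] for j in range(3)) for i in range(3)]
L_direct = angular_momentum_direct(masses, points, omega)
T_rot = 0.5 * sum(w * l for w, l in zip(omega, L_tensor))  # rotational energy

print(L_tensor, L_direct, T_rot)
```

In general `L_tensor` is not parallel to `omega`, which is what the deviation moments express.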


2.2.11 Principal Axis Transformation 


If $I$ is diagonal, there are three eigenvectors $\mathbf{u}_i$, for which $I\,\mathbf{u}_i$ is in the direction of
$\mathbf{u}_i$, namely the three column vectors with two components equal to zero. Since $I$ is a
linear operator, the magnitude of $\mathbf{u}_i$ is of no interest here. We take unit vectors. The factors
$I_i$ in the equation $I\,\mathbf{u}_i = I_i\,\mathbf{u}_i$ are called eigenvalues. If $I$ is not diagonal, then we
still have to rotate. Only $D\,I\,D^{-1}$ can then be diagonal and correspondingly $D\,\mathbf{u}_i$ is
an eigenvector with two vanishing components. We therefore consider

$$(I - I_i\,1)\,\mathbf{u}_i = \mathbf{0},$$

and determine $I_i$ and $\mathbf{u}_i$ from this homogeneous linear system of equations. As is
well known, it has a non-trivial solution only if its determinant vanishes:

$$\det(I - I_i\,1) = 0.$$

This characteristic equation, involving $3\times3$ matrices, leads to an equation of third
order with three solutions $I_1$, $I_2$, $I_3$; actually, to three such equations, viz., $0 =
\det(I - I_1\,1) = \det(I - I_2\,1) = \det(I - I_3\,1)$. These solutions are all real, because
$I$ is real and symmetric. They would still be real if $I$ were only Hermitian, i.e., if $I =
I^\dagger \iff \tilde{I} = I^*$. As for the orthogonal transformations, we may write the eigenvectors
$\mathbf{u}_i$ as a column matrix $U_i$, and for $I_i\,U_i$ its three elements multiplied by the number
$I_i$. Therefore,

$$I_i\,U_j^\dagger U_i = U_j^\dagger\,(I\,U_i) = U_j^\dagger I^\dagger U_i = (I\,U_j)^\dagger U_i = I_j^{*}\,U_j^\dagger U_i.$$

It follows that the eigenvalues are real, because we may set $j = i$ and have $U_i^\dagger U_i = 1$,
and since $(I_i - I_j^{*})\,U_j^\dagger U_i = 0$, it also follows that the eigenvectors corresponding
to different eigenvalues $I_i \ne I_j$ are orthogonal to each other, since in the given case,
we have $U_j^\dagger = \tilde{U}_j^{*} = \tilde{U}_j$. If, however, two eigenvalues are equal, the two
eigenvectors need not be perpendicular to each other, but then any vector from the
subspace spanned by the two eigenvectors is an eigenvector, so any pair of mutually
orthogonal unit vectors may be chosen from this set: then all eigenvectors are pairwise
orthogonal to each other. Since the diagonal elements of $I$ are sums of squares, the
eigenvalues here are not only real, but positive semi-definite, i.e., non-negative (positive
or zero).

When determining the principal moments of inertia, symmetry considerations are
often helpful, since then we can avoid the diagonalization of $I$. Here, axial symmetry is
not necessary. Reflection symmetry with respect to a plane suffices. Then from the
distribution with $\rho(x, y, z) = \rho(-x, y, z)$, symmetric in the $yz$-plane, it follows
that $I_{xy} = -\int xy\,\rho\,dV = I_{yx}$ as well as $I_{xz} = I_{zx}$ vanish, whence $I_{xx}$ is a principal
moment of inertia. The normal to the mirror plane is a principal axis of the moment
of inertia.

For a plane mass distribution, the moment of inertia with respect to the normal to
the plane is composed of the moments of inertia of the two mutually perpendicular
axes in the plane. Hence, if we choose the $x$ and $y$ axes in the plane ($z = 0$), we find
$I_{xx} = \int y^2\,dm$, $I_{yy} = \int x^2\,dm$, and $I_{zz} = I_{xx} + I_{yy}$. See also Problems 2.24-2.26.

Of course, we may order the eigenvectors so that they form a right-handed frame.
Then with a rotation $D$ we arrive at these new unit vectors and, as on p. 36, we may also
set $I' = D\,I\,D^{-1}$. The sum and product of the eigenvalues can thus be determined even
without a principal axis transformation. Because $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ and $\det(AB) =
\det(BA)$ (and $D^{-1}D = 1$), the trace and the determinant are conserved under the
principal axis transformation.
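The invariance of trace and determinant can be demonstrated by rotating a diagonal tensor into a tilted frame. The sketch below is an illustration only: the principal moments and rotation angles are hypothetical, and the helper functions are written out by hand to keep the example self-contained.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def det(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

principal = (2.0, 3.0, 5.0)  # hypothetical principal moments I_1, I_2, I_3
I_diag = [[principal[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

D = matmul(rotation_z(0.7), rotation_x(-1.1))        # some rotation
I_tilted = matmul(matmul(transpose(D), I_diag), D)   # D^{-1} = D^T for a rotation

print(trace(I_tilted), sum(principal))                              # sum of eigenvalues
print(det(I_tilted), principal[0] * principal[1] * principal[2])    # product of eigenvalues
```

Reading off the trace and determinant of a non-diagonal tensor thus already gives the sum and the product of the principal moments.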


Fig. 2.12 Poinsot’s construction. The inertial ellipsoid rolls on the invariant plane, i.e., the plane
tangential to the inertial ellipsoid at the contact point of the angular velocity $\boldsymbol{\omega}$. The angular momentum
$\mathbf{L}$ is perpendicular to this plane. As an example of a body with the inertial ellipsoid shown here
(continuous curve), we take an appropriate cylinder (dotted line). The principal moment of inertia
about the symmetry axis (dashed curve) is $\tfrac12\,m R^2$, while the one perpendicular to it is $\tfrac14\,m\,(R^2 + \tfrac13\,l^2)$


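For an arbitrary axis of rotation $\boldsymbol{\omega}$, since $\boldsymbol{\omega} = \sum_i \mathbf{u}_i\,(\mathbf{u}_i\cdot\boldsymbol{\omega})$, the moment of
inertia is

$$I_\omega = \mathbf{e}_\omega\cdot\int \mathbf{r}\times(\mathbf{e}_\omega\times\mathbf{r})\,dm = \mathbf{e}_\omega\cdot I\,\mathbf{e}_\omega = \sum_i I_i\,(\mathbf{u}_i\cdot\mathbf{e}_\omega)^2.$$

It can thus be evaluated rather easily from the principal moments of inertia. Hence,
they only have to be weighted with the squares of the directional cosines of $\boldsymbol{\omega}$ along
the principal axes of inertia fixed in the body.

When the principal moments of inertia are known, the equation

$$T(\boldsymbol{\omega}) = \tfrac12\,I_\omega\,\omega^2 = \tfrac12\,(I_1\,\omega_1^2 + I_2\,\omega_2^2 + I_3\,\omega_3^2)$$

represents an ellipsoid in the variables $\omega_i$ with semi-axes $\sqrt{2T/I_i}$. This is the inertial
ellipsoid. Clearly, $\partial T/\partial\omega_i = I_i\,\omega_i = L_i$, or in vector notation, just as we had
$\nabla_{\mathbf{v}} T = \mathbf{p}$,

$$\nabla_{\boldsymbol{\omega}} T = \mathbf{L}.$$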


For a given $\boldsymbol{\omega}$, the angular momentum $\mathbf{L}$ is perpendicular to the tangential plane of the
inertial ellipsoid at the point of contact of $\boldsymbol{\omega}$ (Poinsot’s construction) (see Fig. 2.12).
Conversely, for a given angular momentum, the rotation vector can be found at each
time using the inertial ellipsoid.

If no torque acts, then $T = \tfrac12\,\boldsymbol{\omega}\cdot\mathbf{L}$ and $\mathbf{L}$ are constant, and so also is the projection
of $\boldsymbol{\omega}$ onto the spatially fixed angular momentum. The point of contact of $\boldsymbol{\omega}$ then
moves on an invariant plane perpendicular to the angular momentum. The inertial
ellipsoid rolls on this plane and the center of mass is at a constant distance from
this plane. This motion is also called nutation (see Fig. 2.13). Instead of “nutation”,
the term “regular precession” is occasionally used, although for a precession proper the
angular momentum changes because a torque acts.

For an axially symmetric moment of inertia, $\boldsymbol{\omega}(t)$ generates the spatially fixed
herpolhode cone on which the body-fixed polhode cone rolls about the figure
axis. For an axially symmetric moment of inertia, the rotation of the figure axis
about the angular momentum axis degenerates to a nutation cone.

Fig. 2.13 Nutation of the figure axis (dashed line) for an axially symmetric moment of inertia. Here
the polhode cone (continuous curve) rolls on the herpolhode cone (dotted curve). As in Fig. 2.12,
an elongated top is assumed here; otherwise the polhode cone does not roll outside the herpolhode
cone, but rather inside it. Quantitatively, this rolling is described by the Euler equations (without
torque) in Sect. 2.2.12


2.2.12 Accelerated Reference Frames and Fictitious Forces 


So far the laws have been valid in arbitrary inertial systems. But in accelerated 
reference frames, “fictitious forces” also appear. We shall deal with those here. 

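In a rectilinearly accelerated (body-fixed) system with $\mathbf{r}_K = \mathbf{r}_I - \mathbf{r}_0$, the acceleration
$\ddot{\mathbf{r}}_K$ differs from that in the inertial system ($\ddot{\mathbf{r}}_I$) by the acceleration of the origin,
$\ddot{\mathbf{r}}_0$. In particular, from $m\,\ddot{\mathbf{r}}_I = \mathbf{F}$, we have

$$m\,\ddot{\mathbf{r}}_K = \mathbf{F} - m\,\ddot{\mathbf{r}}_0.$$

The last term is the additional inertial force in the accelerated system.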

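But in a rotating reference frame, e.g., fixed in the Earth, according to our considerations
about rigid bodies and for arbitrary vectors $\mathbf{x}$ (the origin of all position
vectors being fixed), we have

$$\Big(\frac{d\mathbf{x}}{dt}\Big)_I = \Big(\frac{d\mathbf{x}}{dt}\Big)_K + \boldsymbol{\omega}\times\mathbf{x}
\quad\Longleftrightarrow\quad
\Big(\frac{d\mathbf{x}}{dt}\Big)_K = \Big(\frac{d\mathbf{x}}{dt}\Big)_I - \boldsymbol{\omega}\times\mathbf{x}.$$

In particular, $\mathbf{v}_K = \mathbf{v}_I - \boldsymbol{\omega}\times\mathbf{r}_K$. Taking this as an operator equation

$$\Big(\frac{d}{dt}\Big)_I = \Big(\frac{d}{dt}\Big)_K + \boldsymbol{\omega}\times,$$

we can easily obtain the second derivative with respect to time:

$$\Big(\frac{d^2}{dt^2}\Big)_I = \Big\{\Big(\frac{d}{dt}\Big)_K + \boldsymbol{\omega}\times\Big\}\Big\{\Big(\frac{d}{dt}\Big)_K + \boldsymbol{\omega}\times\Big\}
= \Big(\frac{d^2}{dt^2} + \dot{\boldsymbol{\omega}}\times{} + 2\,\boldsymbol{\omega}\times\frac{d}{dt} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times{})\Big)_K.$$

Hence, $\mathbf{a}_I = \mathbf{a}_K + \dot{\boldsymbol{\omega}}\times\mathbf{r}_K + 2\,\boldsymbol{\omega}\times\mathbf{v}_K + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}_K)$, and the force equation is

$$m\,\mathbf{a}_K = \mathbf{F} - m\,\dot{\boldsymbol{\omega}}\times\mathbf{r}_K - 2m\,\boldsymbol{\omega}\times\mathbf{v}_K - m\,\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}_K).$$

The last term is the well-known centrifugal force. It points away from the axis of
rotation. If $\mathbf{r}_\perp$ is the part of $\mathbf{r}_K$ perpendicular to $\boldsymbol{\omega}$ (measured from the axis of rotation),
the centrifugal force is equal to $m\,\omega^2\,\mathbf{r}_\perp$.

Fig. 2.14 Coriolis force on the Earth. Our “laboratory” rotates eastwards (indicated by the arrow
at the equator and the rotation vector at the north pole). Rectilinear motions are thus deflected:
motions restricted to the horizontal are thus deflected to the right in the northern hemisphere and
to the left in the southern hemisphere (see also Problem 2.29)

The term $-2m\,\boldsymbol{\omega}\times\mathbf{v}_K$ is the Coriolis force, named after G.-G. Coriolis,³ which
occurs only for moving bodies and is formally similar to the Lorentz force $-q\,\mathbf{B}\times\mathbf{v}$.
On the Earth, it is weak compared to the attraction of the Earth. Therefore, we express
the rotational vector $\boldsymbol{\omega}$ in terms of the local unit vectors of the spherical coordinates
$(\theta, \varphi)$ (see Fig. 1.12): $\boldsymbol{\omega} = \omega\,(\cos\theta\,\mathbf{e}_r - \sin\theta\,\mathbf{e}_\theta)$. The part $2\omega\cos\theta\,\mathbf{v}\times\mathbf{e}_r$ deflects
horizontal motions in the northern hemisphere ($0 \le \theta < \tfrac12\pi$) to the right, and in the
southern hemisphere ($\tfrac12\pi < \theta \le \pi$) to the left (see Fig. 2.14). Among other things, it
rotates the oscillation plane of Foucault’s pendulum. The remainder $2\omega\sin\theta\,\mathbf{e}_\theta\times\mathbf{v}$
is strongest at the equator and deflects uprising masses to the west, i.e., against the
rotational orientation of the Earth.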
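The force equation in the rotating frame can be turned into a small helper that returns $\mathbf{a}_K$. The sketch below is illustrative: the function name and the sample values are hypothetical. As a consistency check, it considers a force-free body at rest in the inertial system, which must appear to move on a circle in the rotating system.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def acceleration_in_rotating_frame(force, mass, omega, omega_dot, r_k, v_k):
    """a_K = F/m - omega_dot x r_K - 2 omega x v_K - omega x (omega x r_K)."""
    angular_acc_term = cross(omega_dot, r_k)
    coriolis_term = cross(omega, v_k)
    centrifugal_term = cross(omega, cross(omega, r_k))
    return tuple(f / mass - e - 2.0 * c - z
                 for f, e, c, z in zip(force, angular_acc_term,
                                       coriolis_term, centrifugal_term))

w = 2.0                      # hypothetical angular velocity about the z axis
omega = (0.0, 0.0, w)
r_k = (1.0, 0.0, 0.0)
# Body at rest in the inertial system: v_K = v_I - omega x r_K with v_I = 0.
v_k = tuple(-c for c in cross(omega, r_k))

a_k = acceleration_in_rotating_frame((0.0, 0.0, 0.0), 1.0, omega,
                                     (0.0, 0.0, 0.0), r_k, v_k)
print(a_k)  # centripetal acceleration of magnitude w^2, pointing toward the axis
```

In this example the Coriolis term is twice as large as the centrifugal term and opposite to it, which leaves exactly the centripetal acceleration needed for the apparent circular motion.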


3 Gustave-Gaspard Coriolis (1792-1843). 




The equation $\dot{\mathbf{L}} = \mathbf{M}$, which is valid in the inertial system, is more complicated
in the rotating system because $(d\mathbf{L}/dt)_I = (d\mathbf{L}/dt)_K + \boldsymbol{\omega}\times\mathbf{L}_K$, i.e.,

$$\dot{\mathbf{L}} = \mathbf{M} - \boldsymbol{\omega}\times\mathbf{L},$$

where we now leave out the index $K$ for $\mathbf{L}$ (the torque refers further on to the inertial
system). On the other hand, the angular momentum and the rotational vector are
related to each other in a simpler way, because in the body-fixed system the moment
of inertia (of a rigid body) does not change with time. In particular, if we introduce
Cartesian coordinates along the principal axes of the moment of inertia, such that
$L_i = I_i\,\omega_i$, then it follows that

$$I_1\,\dot{\omega}_1 = M_1 + \omega_2\,\omega_3\,(I_2 - I_3), \quad\text{and cyclic permutations.}$$

These are the Euler equations for the rigid body. We shall investigate these now for
$\mathbf{M} = \mathbf{0}$, namely for the free top, and deal with the heavy top ($\mathbf{M} \ne \mathbf{0}$) in Sect. 2.4.10.

Since $\dot{\boldsymbol{\omega}} = \mathbf{0}$, the spherically symmetric top (with $I_1 = I_2 = I_3$) always rotates
about a fixed axis. With the axially symmetric top ($I_1 = I_2 \ne I_3$), only the component
along the symmetry axis is conserved ($I_3\,\dot{\omega}_3 = 0 \Longrightarrow \omega_3$ and $L_3$ constant). With the
fixed vector

$$\boldsymbol{\Omega} = \frac{I_3 - I_1}{I_1}\,\omega_3\,\mathbf{e}_3,$$

and because $\dot{\omega}_1 = -\Omega\,\omega_2$, $\dot{\omega}_2 = \Omega\,\omega_1$, $\dot{\omega}_3 = 0$, the Euler equations (for $I_1 = I_2$) can
be taken together as

$$\dot{\boldsymbol{\omega}} = \boldsymbol{\Omega}\times\boldsymbol{\omega}.$$

Thus the vector $\boldsymbol{\omega}$ moves with angular frequency $\Omega$ on a polhode cone about the
body-fixed figure axis. The opening angle of the cone is determined by the integration
constants, e.g., energy and value of the angular momentum.

For a three-axis inertial ellipsoid ($I_1 \ne I_2 \ne I_3$), all three components of $\boldsymbol{\omega}$ change
in the course of time. Then Poinsot’s construction can lead us to the result. In any case,
$T = \tfrac12\,\boldsymbol{\omega}\cdot\mathbf{L}$ is a constant of the motion (if no torque acts) and therefore $\mathbf{L} = \nabla_{\boldsymbol{\omega}} T$.
Problem 2.28 will also be instructive here.
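The torque-free Euler equations are simple to integrate numerically. The following sketch assumes a classical Runge-Kutta scheme and hypothetical principal moments and initial conditions; for the axially symmetric top it compares the result with the rotation of $(\omega_1, \omega_2)$ at angular frequency $\Omega$.

```python
import math

def euler_rhs(w, I):
    """Torque-free Euler equations: I_1 dw_1/dt = w_2 w_3 (I_2 - I_3), and cyclic."""
    return (w[1] * w[2] * (I[1] - I[2]) / I[0],
            w[2] * w[0] * (I[2] - I[0]) / I[1],
            w[0] * w[1] * (I[0] - I[1]) / I[2])

def integrate_free_top(w, I, t_end, steps=20_000):
    """Fourth-order Runge-Kutta integration of the body-fixed components of omega."""
    h = t_end / steps
    def shifted(base, k, factor):
        return tuple(b + factor * h * ki for b, ki in zip(base, k))
    for _ in range(steps):
        k1 = euler_rhs(w, I)
        k2 = euler_rhs(shifted(w, k1, 0.5), I)
        k3 = euler_rhs(shifted(w, k2, 0.5), I)
        k4 = euler_rhs(shifted(w, k3, 1.0), I)
        w = tuple(wi + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                  for wi, a, b, c, d in zip(w, k1, k2, k3, k4))
    return w

I_sym = (2.0, 2.0, 3.0)   # hypothetical symmetric top, I_1 = I_2
w0 = (0.3, 0.0, 1.0)
Omega = (I_sym[2] - I_sym[0]) / I_sym[0] * w0[2]

t = 3.0
w_t = integrate_free_top(w0, I_sym, t)
w_expected = (w0[0] * math.cos(Omega * t), w0[0] * math.sin(Omega * t), w0[2])
print(w_t, w_expected)

def kinetic_energy(w, I):
    return 0.5 * sum(Ii * wi * wi for Ii, wi in zip(I, w))

I_asym = (1.0, 2.0, 3.0)  # hypothetical three-axis top
w_asym = integrate_free_top((0.4, 0.1, 0.8), I_asym, 5.0)
print(kinetic_energy((0.4, 0.1, 0.8), I_asym), kinetic_energy(w_asym, I_asym))
```

For the three-axis case no simple closed form is used here; the sketch only checks that the rotational energy stays constant along the numerical solution.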


2.2.13 Summary of Newtonian Mechanics 


Newton identified three basic laws for non-relativistic mechanics: the inertial law 
which says that force-free bodies move in a uniform rectilinear way or are at rest 
(this allows us to draw conclusions about mass ratios in collision processes), the 
equation $\dot{\mathbf{p}} = \mathbf{F}$ (where $\mathbf{p}$ is an abbreviation for $m\mathbf{v}$), and the law of “action and
reaction”. Without the action of a force, the momentum $\mathbf{p}$ is conserved, so we only
need to investigate those motions that are affected by forces. We have explained this
in some detail for collisions and the motion of planets. Here the bodies were treated 
as point masses. We then also treated extended rigid bodies, describing their motion 
about the center of mass with the Euler equations. In accelerating reference frames, 
fictitious forces must also be accounted for, e.g., the centrifugal and Coriolis forces. 


2.3 Lagrangian Mechanics 


2.3.1 D’Alembert’s Principle 


We could have considered many more applications of Newtonian mechanics. Basi- 
cally, there will be no new physical effects in the next few sections. These will only 
appear in electrodynamics (relativity theory), quantum mechanics, and statistical 
mechanics. But with new notions and better mathematical methods, we can often 
simplify the workload and even obtain a complete mastery of it. In particular, we 
shall deal more easily with “geometric constraints” (forces of constraint)—this is 
accomplished by Lagrangian mechanics.* 

Here we generalize the notion of momentum and, in addition to the mechani- 
cal momentum mv considered so far, introduce also the canonical momentum p. 
Therefore, instead of the usual letter p, we shall always write mv for the mechanical 
momentum from now on. 

To begin with, we generalize the principle of virtual work (p. 58) of statics to
time-dependent processes, i.e., to d’Alembert’s principle. Here the inertial force
$-d(m\mathbf{v})/dt$ appears as a new force:

$$\sum_i \Big(\mathbf{F}_i - \frac{d(m_i\mathbf{v}_i)}{dt}\Big)\cdot\delta\mathbf{r}_i = 0, \quad\text{for } \delta t = 0.$$

As long as we neglect frictional forces, forces of constraint do not contribute, i.e.,
$\mathbf{Z}_i\cdot\delta\mathbf{r}_i = 0$. Then we only need to account for the remaining forces. For the determination
of the force of constraint for accelerated bodies, we have to use the expression
$\mathbf{Z} = m\dot{\mathbf{v}} - \mathbf{F}$, and the body presses against the geometrically formulated boundaries
with the opposite force.

If, for example, we enforce a curved orbit with the curvature radius $R$ for a given
velocity $v$, then according to p. 7, the normal acceleration is $(v^2/R)\,\mathbf{e}_N$. A force of
constraint equal to $m\,v^2/R = m\,\omega^2 R$ will thus be necessary, if no further force
acts; only then will the centrifugal force be canceled. Since inertial forces occur
only for accelerations, they can be taken as fictitious forces, and can be “transformed
away” in an accelerated reference frame. To do this we generally require curvilinear
coordinates. This idea leads to general relativity theory, where we use the fact that
the gravitational and inertial masses are always equal.

4 Joseph Louis de Lagrange (1736-1813) became professor in Turin in 1755, was Euler’s successor
in Berlin in 1766, and became professor in Paris in 1787.

As long as no forces of constraint occur, we do not need d’ Alembert’s principle, 
as we have seen in the last section. But otherwise this principle is very useful—in 
statics the principle of virtual work may be employed repeatedly. And now we even 
know the generalization to changes in time. 

Correspondingly, we can generalize the Lagrangian equations of the first kind
from statics (see p. 61) to time-dependent processes:

$$\mathbf{F} + \sum_n \lambda_n\,\nabla\Phi_n = \frac{d(m\mathbf{v})}{dt}.$$


This equation refers to one particle—as in statics it can be generalized to more 
particles. Then further coordinates and masses are involved. 


2.3.2 Constraints 


We already know an example of constraints from the case of the rigid body: instead 
of introducing 3N independent coordinates (degrees of freedom) for N point masses, 
six are sufficient, because for a rigid body the remaining ones can be chosen as fixed— 
clearly an example of “geometrical” constraints. Something like this has already been 
encountered in statics: for a displacement along a line, there is only one degree of 
freedom, for the displacement on a plane there are only two. A constraint is said to 
be holonomic or integrable if it can be brought into the form $\Phi(t, \mathbf{r}_1, \ldots, \mathbf{r}_N) = 0$.
(The Greek holos means whole or perfect, implying that it can be integrated.) If 
the constraint refers to velocities or if it can be expressed only differentially or as 
an inequality, e.g., confinement within a volume, then we are dealing with a “non- 
holonomic” (non-integrable) condition. (Sometimes constraints given as inequalities 
are referred to as unilateral or bilateral, because the forces of constraint act only 
in one direction or two.) If a constraint does not depend explicitly upon time then 
it is said to be scleronomous (skleros means fixed or rigid), otherwise rheonomous 
(rheos means flowing). In statics, we always assumed holonomic and scleronomous
constraints. They are barely simpler than differential constraints, which occur, e.g.,
when a wheel rolls on a plane. Then its rotation is related to the motion of the
contact point (Problem 2.7).

Instead of constraints, we can also introduce forces of constraint which ensure 
that the constraints are respected: constraints and forces of constraint are two pic- 
tures for the same situation, because both allow us to deal with the motion of the 
body. However, geometrical constraints are intuitively descriptive, while forces of 
constraint often have to be computed, something that is necessary, however, when 
designing machines in order to determine forces and loads. 




In general, constraints couple the equations of motion. But for holonomic con- 
straints, the number of independent variables can often be reduced by a clever choice 
of coordinates. Then positions can no longer be described by three-vectors, and 
the coordinates are often different physical quantities, appearing, e.g., as angles or 
amplitudes of a Fourier decomposition. In Hamiltonian mechanics, we may also take 
(angular) momentum components and energy as new variables. 

In the following, we shall neglect kinetic friction. Then the forces of constraint 
do not lead to tangential acceleration, but just a normal acceleration, whence they 
cannot perform work, being perpendicular to the allowed displacements—as long 
as no kinetic friction perturbs the system, we do not need to account for forces of 
constraint in the energy conservation law. 

If the constraints lead to a single degree of freedom and are scleronomous, then the
energy (conservation) law helps: instead of one differential equation of second
order, only one of first order remains to be solved (with the energy as integration
constant):

$$E = \frac{m}{2}\,v^2 + V \quad\Longrightarrow\quad \frac{dx}{dt} = \sqrt{\frac{E - V(x)}{m/2}} .$$


Of course, there can only be curl-free forces here, otherwise there is no potential 
energy. 
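The first-order equation can be integrated by separation of variables, $t = \int dx/\sqrt{(E - V)/(m/2)}$. The following is a minimal numerical sketch of this quadrature, not a method taken from the text. The harmonic potential, the parameter values and the function names are assumptions chosen only because the result can be checked against the known solution $x = a\sin\omega t$.

```python
import math

def travel_time(potential, energy, mass, x_start, x_end, steps=20000):
    """Time to move from x_start to x_end, using dt = dx / sqrt((E - V)/(m/2)).

    Midpoint rule; the interval must not contain a turning point (E = V).
    """
    h = (x_end - x_start) / steps
    total = 0.0
    for i in range(steps):
        x = x_start + (i + 0.5) * h
        speed = math.sqrt((energy - potential(x)) / (mass / 2))
        total += h / speed
    return total

# Assumed test case: harmonic potential V = m w^2 x^2 / 2 with amplitude a.
mass, omega, amplitude = 1.0, 2.0, 0.5
energy = 0.5 * mass * omega**2 * amplitude**2

def harmonic_potential(x):
    return 0.5 * mass * omega**2 * x**2

t_numeric = travel_time(harmonic_potential, energy, mass, 0.0, amplitude / 2)
t_exact = math.asin(0.5) / omega   # from x = a sin(w t) with x = a/2
print(t_numeric, t_exact)
```

The integration stops at half the amplitude on purpose: at the turning point the integrand diverges (integrably), and a plain midpoint rule would converge only slowly there.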


2.3.3 Lagrange Equations of the Second Kind 


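For time-dependent problems, we start from d’Alembert’s principle, i.e., from the
equation $\sum_i \{\mathbf{F}_i - d(m_i\mathbf{v}_i)/dt\}\cdot\delta\mathbf{r}_i = 0$ for $\delta t = 0$. Since

$$\delta\mathbf{r}_i = \sum_{k=1}^{f} \frac{\partial\mathbf{r}_i}{\partial x^k}\,\delta x^k ,$$

where the $\delta x^k$ do not depend upon each other (otherwise Lagrangian parameters are
still necessary), or in particular if there is only one $\delta x^k \neq 0$, and since

$$F_k = \sum_{i=1}^{N} \mathbf{F}_i\cdot\frac{\partial\mathbf{r}_i}{\partial x^k} ,$$

we find the equations

$$F_k = \sum_i \frac{d(m_i\mathbf{v}_i)}{dt}\cdot\frac{\partial\mathbf{r}_i}{\partial x^k} , \quad\text{for } k\in\{1,\ldots,f\} .$$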




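The right-hand side can be simplified:

$$\frac{d(m\mathbf{v})}{dt}\cdot\frac{\partial\mathbf{r}}{\partial x^k} = \frac{d}{dt}\Big(m\mathbf{v}\cdot\frac{\partial\mathbf{r}}{\partial x^k}\Big) - m\mathbf{v}\cdot\frac{\partial\mathbf{v}}{\partial x^k} ,$$

and since $\mathbf{v} = \dot{\mathbf{r}}$, we also have $\partial\mathbf{r}/\partial x^k = \partial\mathbf{v}/\partial\dot x^k$, because $t$ is treated as the orbital
parameter, whence

$$\frac{d(m\mathbf{v})}{dt}\cdot\frac{\partial\mathbf{r}}{\partial x^k} = \frac{d}{dt}\Big(m\mathbf{v}\cdot\frac{\partial\mathbf{v}}{\partial\dot x^k}\Big) - m\mathbf{v}\cdot\frac{\partial\mathbf{v}}{\partial x^k} .$$

But now we have $\mathbf{v}\cdot d\mathbf{v} = \frac12\,d(\mathbf{v}\cdot\mathbf{v}) = dT/m$ and therefore, with $T = \sum_i T_i$,

$$\sum_i \frac{d(m_i\mathbf{v}_i)}{dt}\cdot\frac{\partial\mathbf{r}_i}{\partial x^k} = \frac{d}{dt}\Big(\frac{\partial T}{\partial\dot x^k}\Big) - \frac{\partial T}{\partial x^k} .$$

Finally, the $f$ equilibrium conditions $F_k = 0$ can be generalized to

$$F_k = \frac{d}{dt}\Big(\frac{\partial T}{\partial\dot x^k}\Big) - \frac{\partial T}{\partial x^k} , \quad\text{for } k\in\{1,\ldots,f\} .$$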


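These are the generalized Lagrange equations of the second kind. In general, however,
we also assume that the external forces can be derived from a potential energy:

$$\mathbf{F}_i = -\nabla_i V(\mathbf{r}_1,\ldots,\mathbf{r}_N) \quad\Longrightarrow\quad F_k = -\sum_i \nabla_i V\cdot\frac{\partial\mathbf{r}_i}{\partial x^k} = -\frac{\partial V}{\partial x^k} .$$

Since $\partial V/\partial\dot x^k = 0$, we introduce the Lagrange function

$$L = T - V ,$$

and we obtain the Lagrange equations of the second kind (Euler-Lagrange equations)

$$\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot x^k}\Big) - \frac{\partial L}{\partial x^k} = 0 , \quad\text{for } k\in\{1,\ldots,f\} .$$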


Many problems of mechanics can be solved with these. We need only the scalar 
Lagrange function L and a convenient choice of coordinates. 

Let us consider as an example the plane motion of a particle of mass $m$ under
arbitrary (not necessarily curl-free) forces. In Cartesian coordinates $x$, $y$, we have

$$T = \frac{m}{2}\,(\dot x^2 + \dot y^2) ,$$

and consequently,




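$$\frac{\partial T}{\partial\dot x} = m\dot x , \quad \frac{\partial T}{\partial\dot y} = m\dot y , \quad \frac{\partial T}{\partial x} = 0 = \frac{\partial T}{\partial y} .$$

Therefore, for constant mass, the Lagrange equations lead to Newton’s relation $\mathbf{F} = m\ddot{\mathbf{r}}$.
In contrast, in (curvilinear) polar coordinates $r$, $\varphi$, we have

$$T = \frac{m}{2}\,(\dot r^2 + r^2\dot\varphi^2) ,$$

and consequently,

$$\frac{\partial T}{\partial\dot r} = m\dot r , \quad \frac{\partial T}{\partial\dot\varphi} = m r^2\dot\varphi , \quad \frac{\partial T}{\partial r} = m r\dot\varphi^2 , \quad \frac{\partial T}{\partial\varphi} = 0 .$$

With $F_r = \mathbf{F}\cdot\partial\mathbf{r}/\partial r = \mathbf{F}\cdot\mathbf{r}/r$ and $F_\varphi = \mathbf{F}\cdot\partial\mathbf{r}/\partial\varphi = \mathbf{F}\cdot r\,\mathbf{e}_\varphi$ (according to p. 40),
whence $F_\varphi = \mathbf{F}\cdot(\mathbf{n}\times\mathbf{r}) = (\mathbf{r}\times\mathbf{F})\cdot\mathbf{n} = \mathbf{M}\cdot\mathbf{n}$, we have

$$F_r = m\ddot r - m r\dot\varphi^2 \quad\text{and}\quad F_\varphi = \frac{d}{dt}\,(m r^2\dot\varphi) .$$

According to the first equation for the radial motion, in addition to $F_r$, the centrifugal
force $m r\dot\varphi^2$ is accounted for, and for $\dot\varphi$ we have so far set $\omega$, e.g., on p. 91. The second
equation has been written so far as $\mathbf{M} = d\mathbf{L}/dt$, because $\mathbf{L}\cdot\mathbf{n} = m r^2\dot\varphi$. From our
new viewpoint, it is the same equation as $\mathbf{F} = d(m\mathbf{v})/dt$, only expressed in other
coordinates.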
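The second polar equation says that $m r^2\dot\varphi$ is constant whenever $F_\varphi = 0$, i.e., for any central force. The sketch below checks this numerically. It integrates Newton's equation in Cartesian coordinates with a classical Runge-Kutta step and monitors $m r^2\dot\varphi = m(x\dot y - y\dot x)$. The inverse-square force, the unit mass, the initial values and all function names are assumptions made for illustration only.

```python
def rk4_step(state, dt, accel):
    """One classical Runge-Kutta step for state = (x, y, vx, vy)."""
    def deriv(s):
        x, y, vx, vy = s
        ax, ay = accel(x, y)
        return (vx, vy, ax, ay)

    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def central_accel(x, y):
    """Assumed example: attractive force F = -r/r^3 on a unit mass."""
    r3 = (x * x + y * y) ** 1.5
    return (-x / r3, -y / r3)

def angular_momentum(state, mass=1.0):
    """m r^2 dphi/dt, written in Cartesian components."""
    x, y, vx, vy = state
    return mass * (x * vy - y * vx)

state = (1.0, 0.0, 0.0, 1.2)          # bound orbit (negative total energy)
l_start = angular_momentum(state)
for _ in range(5000):
    state = rk4_step(state, 1e-3, central_accel)
l_end = angular_momentum(state)
print(l_start, l_end)
```

The Runge-Kutta scheme does not conserve the angular momentum exactly, so the two printed values agree only up to the (small) integration error.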


2.3.4 Velocity-Dependent Forces and Friction 


For time- and velocity-dependent forces, there is no potential energy and thus also
no Lagrange function as yet. But in fact, a generalized potential energy $U$ with the
property

$$F_k = -\frac{\partial U}{\partial x^k} + \frac{d}{dt}\Big(\frac{\partial U}{\partial\dot x^k}\Big) , \quad\text{for } k\in\{1,\ldots,f\} ,$$

also suffices, because then the generalized Lagrange function

$$L = T - U$$

obeys the Lagrange equations of the second kind.

The most important example is the Lorentz force on a charge in an electromagnetic
field:

$$\mathbf{F} = q\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}) .$$




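In order to derive this from a generalized potential energy $U$, we employ the two
Maxwell equations

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} , \quad \nabla\cdot\mathbf{B} = 0 .$$

According to this the two vector fields $\mathbf{E}$ and $\mathbf{B}$ are related to each other and can be
derived from a scalar potential $\Phi$ and a vector potential $\mathbf{A}$, taken at the coordinates
of the test body:

$$\mathbf{E} = -\nabla\Phi - \frac{\partial\mathbf{A}}{\partial t} , \quad \mathbf{B} = \nabla\times\mathbf{A} .$$

The two potentials $\Phi$ and $\mathbf{A}$ are functions of $t$ and $\mathbf{r}$ (but not $\mathbf{v}$). Here the position
of the test body depends upon the time, and therefore total and partial derivatives are
to be distinguished from each other: $d\mathbf{A}/dt - \partial\mathbf{A}/\partial t = (\mathbf{v}\cdot\nabla)\mathbf{A}$. But since $\mathbf{r}$ and $\mathbf{v}$
are to be treated as independent variables, we may set $\mathbf{v}\times(\nabla\times\mathbf{A}) = \nabla(\mathbf{v}\cdot\mathbf{A}) - (\mathbf{v}\cdot\nabla)\mathbf{A}$,
because the terms to be expected formally, $-\mathbf{A}\times(\nabla\times\mathbf{v}) - (\mathbf{A}\cdot\nabla)\mathbf{v}$, do
not contribute. This leads to

$$\mathbf{F} = q\,\Big\{-\nabla(\Phi - \mathbf{v}\cdot\mathbf{A}) - \frac{d\mathbf{A}}{dt}\Big\} .$$

Therefore, the generalized potential energy for the Lorentz force on a charge $q$ in
the electromagnetic field is

$$U = q\,(\Phi - \mathbf{v}\cdot\mathbf{A}) .$$

However, the potentials are not yet uniquely determined. In particular, we may still
have gauge transformations: $\Phi' = \Phi + \partial\Psi/\partial t$ and $\mathbf{A}' = \mathbf{A} - \nabla\Psi$ deliver the same
fields $\mathbf{E}$ and $\mathbf{B}$ as $\Phi$ and $\mathbf{A}$. This gauge invariance of the fields leads to the fact
that $U' = q\,(\Phi' - \mathbf{v}\cdot\mathbf{A}') = U + q\,d\Psi/dt$ can be taken as the generalized potential
energy, corresponding to the undetermined Lagrange function (an example is given
in Problem 2.31)

$$L' = L - \frac{dG}{dt} .$$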
We will come back to the gauge dependence of the Lagrange function in Sect. 2.4.5.
There it will also be understood why we write $G$ here instead of $q\Psi$, because there $G$
is a generating function (generator) of a canonical transformation. (However, here
$G$ depends only upon $t$ and $x^k$, while there it may also depend on further variables.)

For friction there is no generalized potential energy $U$. Then we have to take

$$\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot x^k}\Big) - \frac{\partial L}{\partial x^k} = f_k ,$$




where f contains all forces which cannot be derived from a generalized potential 
energy U. 

There are many examples where the frictional force is proportional to the velocity,
i.e., $\mathbf{f} = -\alpha\mathbf{v}$, e.g., laminar flow (only turbulent flow leads to a squared term) or
electrical loop currents with Ohmic resistance. This is called Stokes friction, in contrast
to Newtonian friction, where $\mathbf{f} \propto -v\,\mathbf{v}$ (e.g., for the case of free fall). Stokes-type
friction also occurs in the Langevin equation (Sect. 6.2.7). Then we may set

$$\mathbf{f} = -\nabla_{\mathbf{v}}\mathcal{F} , \quad\text{with}\quad \mathcal{F} = \frac{\alpha}{2}\,\mathbf{v}\cdot\mathbf{v} \quad\text{and}\quad \alpha > 0 ,$$

where $\mathcal{F}$ is Rayleigh’s dissipation function. It supplies half the power which the
system has to give off because of the friction: $dA = -\mathbf{f}\cdot d\mathbf{r} = -\mathbf{f}\cdot\mathbf{v}\,dt = \alpha v^2\,dt = 2\mathcal{F}\,dt$.
In this case we need two scalar functions, $L$ and $\mathcal{F}$, to derive the equation
of motion (and to describe the thermal expansion).

But for this friction and $\partial L/\partial\dot x = m\dot x$, we can also take the Lagrange function
$L\exp(\alpha t/m)$ (now time-dependent). The missing term $\alpha\dot x$ then appears in addition
to $d(\partial L/\partial\dot x)/dt - \partial L/\partial x$.


2.3.5 Conserved Quantities. Canonical and Mechanical 
Momentum 


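The Lagrange function $L(t, x, \dot x)$ contains the velocity in a non-linear way. Therefore,
the Lagrange equation (without friction!),

$$\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot x^k}\Big) = \frac{\partial L}{\partial x^k} ,$$

is a differential equation of second order, because $\ddot x$ occurs. We search for “solutions”
$C(t, x, \dot x) = 0$, which are differential equations of only first order.

This is straightforward if $L$ does not depend on $x^k$, but on $\dot x^k$:

$$\frac{\partial L}{\partial x^k} = 0 \quad\Longrightarrow\quad \frac{d}{dt}\frac{\partial L}{\partial\dot x^k} = 0 \quad\Longrightarrow\quad \frac{\partial L}{\partial\dot x^k} = \text{const.}$$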

The assumption $\partial L/\partial x^k = 0$ is justified if we can move the origin of $x^k$ with
impunity, i.e., if we can add an arbitrary constant to $x^k$. For example, the dynamics of
a rotating wheel does not depend on the angle coordinate $\varphi$, but only on the angular
velocity $\dot\varphi$. Therefore, all coordinates which do not appear in $L$ are said to be cyclic;
a further important example of cyclic coordinates is given in Problem 2.32.
Generally, $\partial L/\partial\dot x^k$ is called the canonical momentum conjugate to $x^k$:

$$p_k = \frac{\partial L}{\partial\dot x^k} .$$



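(Here we have the decisive quantity for Hamiltonian mechanics, as we shall see in the
next section.) A free point mass has $L = \frac{m}{2}\,\mathbf{v}\cdot\mathbf{v}$ and $\mathbf{p} = m\mathbf{v}$, whence $\mathbf{p} = \nabla_{\mathbf{v}} L$. For
rotations, we have $L = \frac12 I\dot\varphi^2$ and we obtain $p_\varphi = I\dot\varphi$ as the canonical momentum,
i.e., the angular momentum, or more precisely, the angular momentum component
along the corresponding axis of rotation. This holds even if a potential energy $V$ also
appears. If, however, a point mass with charge $q$ moves in an electromagnetic field,
then $L = \frac{m}{2}\,\mathbf{v}\cdot\mathbf{v} - q\,(\Phi - \mathbf{v}\cdot\mathbf{A})$, and hence the canonical momentum is

$$\mathbf{p} = m\mathbf{v} + q\mathbf{A} .$$

It differs from the mechanical momentum $m\mathbf{v}$ by the additional term $q\mathbf{A}$ and depends
on the gauge, whence $\mathbf{p}' = \mathbf{p} - \nabla G$ is also a canonical momentum.

In the following, $\mathbf{p}$ will always denote the canonical momentum and $m\mathbf{v}$ the
mechanical one. Therefore, we may no longer call $\dot{\mathbf{p}}$ a force $\mathbf{F}$, because we have

$$\frac{d(m\mathbf{v})}{dt} = \dot{\mathbf{p}} - q\dot{\mathbf{A}} ,$$

and according to the last section,

$$\mathbf{F} = -\nabla U - q\dot{\mathbf{A}} .$$

Consequently,

$$\frac{d(m\mathbf{v})}{dt} = \mathbf{F} \quad\Longrightarrow\quad \dot{\mathbf{p}} = -\nabla U$$

delivers a noteworthy result.

A homogeneous magnetic field $\mathbf{B}$ can be obtained from the vector potential
$\mathbf{A} = \frac12\,\mathbf{B}\times\mathbf{r}$ (among others), and since $\Phi = 0$, this leads to $-\nabla U = \frac12\,q\,\mathbf{v}\times\mathbf{B}$. Here $\dot{\mathbf{p}}$
is thus equal to half the Lorentz force.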

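In a constant and homogeneous magnetic field, which thus does not depend upon
either $t$ or $\mathbf{r}$, neither the mechanical nor the canonical momentum is conserved.
However, since $m\dot{\mathbf{v}} = q\,\dot{\mathbf{r}}\times\mathbf{B}$, only the pseudo-momentum

$$\mathbf{K} = m\mathbf{v} + q\,\mathbf{B}\times\mathbf{r}$$

is conserved. In fact, on the helical orbit, only the mechanical momentum in the field
direction is conserved ($K_\parallel = m v_\parallel$). Perpendicular to it, there is a circular orbit, and
we use $\boldsymbol{\omega} = -q\mathbf{B}/m$ from p. 78. Using

$$m\mathbf{v}_\perp = m\,\boldsymbol{\omega}\times\mathbf{r} + \mathbf{K}_\perp = m\,\boldsymbol{\omega}\times(\mathbf{r} - \mathbf{r}_A) ,$$

which implies

$$\mathbf{K}_\perp = m\,\mathbf{r}_A\times\boldsymbol{\omega} ,$$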




we infer for the helical axis

$$\mathbf{r}_A = \frac{\boldsymbol{\omega}}{\omega}\times\frac{\mathbf{K}}{m\omega} ,$$

and the radius $v_\perp/\omega$.
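The conservation of the pseudo-momentum $\mathbf{K} = m\mathbf{v} + q\,\mathbf{B}\times\mathbf{r}$ can be checked numerically. The sketch below integrates $m\dot{\mathbf{v}} = q\,\mathbf{v}\times\mathbf{B}$ with a classical Runge-Kutta scheme and compares $\mathbf{K}$ before and after. The field strength, charge, mass, initial values and function names are assumptions for illustration and do not come from the text.

```python
def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def lorentz_rk4(r, v, q, m, B, dt, steps):
    """Integrate dr/dt = v, m dv/dt = q v x B (homogeneous B, no E field)."""
    def deriv(state):
        vel = state[3:]
        acc = tuple(q / m * c for c in cross(vel, B))
        return vel + acc                      # tuple concatenation: 6 components

    def shifted(state, k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    state = tuple(r) + tuple(v)
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv(shifted(state, k1, dt / 2))
        k3 = deriv(shifted(state, k2, dt / 2))
        k4 = deriv(shifted(state, k3, dt))
        state = tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state[:3], state[3:]

def pseudo_momentum(r, v, q, m, B):
    """K = m v + q B x r."""
    return tuple(m * vi + q * ci for vi, ci in zip(v, cross(B, r)))

# Assumed example values: field along z, velocity with a component along the field.
q, m, B = 1.0, 1.0, (0.0, 0.0, 2.0)
r0, v0 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.5)
r1, v1 = lorentz_rk4(r0, v0, q, m, B, dt=0.01, steps=1000)
K0 = pseudo_momentum(r0, v0, q, m, B)
K1 = pseudo_momentum(r1, v1, q, m, B)
print(K0, K1)   # equal up to rounding
print(v0, v1)   # only the z-component of v is unchanged
```

Since $\mathbf{K}$ is linear in $\mathbf{r}$ and $\mathbf{v}$ and the equations of motion are linear, the Runge-Kutta scheme preserves $\mathbf{K}$ up to rounding errors, whereas the perpendicular part of $m\mathbf{v}$ rotates.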

The canonical momentum conjugate to a cyclic variable is conserved according to
what was said above, i.e., $\dot p = \partial L/\partial x = 0$. Therefore, for (infinitesimal) translational
invariance, the momentum is conserved, and for isotropy (rotational invariance), the
angular momentum is conserved.

To see what happens if $L$ does not depend explicitly on time, we note that, according
to the Lagrange equation, we have

$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + \sum_k\Big(\frac{\partial L}{\partial x^k}\,\dot x^k + \frac{\partial L}{\partial\dot x^k}\,\ddot x^k\Big) = \frac{\partial L}{\partial t} + \frac{d}{dt}\sum_k p_k\,\dot x^k .$$

Thus, for $\partial L/\partial t = 0$, the sum $\sum_k p_k\dot x^k - L$ is also a conserved quantity (constant
of the motion). Here $\sum_k p_k\dot x^k$ is equal to $2T$ if the kinetic energy $T$ is a homogeneous
function of second order in the velocity, thus if $T(k\mathbf{v}) = k^2\,T(\mathbf{v})$ holds for all real $k$,
which according to Euler’s identity for continuously differentiable $T$ is equivalent to
$\mathbf{v}\cdot\nabla_{\mathbf{v}} T(\mathbf{v}) = 2T(\mathbf{v})$. (For time-independent constraints, but not for time-dependent
ones, $T$ is homogeneous of second order.) Thus for $\partial L/\partial t = 0$ and $2T = \mathbf{v}\cdot\nabla_{\mathbf{v}} T$,
the quantity $2T - L$ is conserved. If there is in addition a potential energy $V$, then
$L = T - V$ and the energy $T + V$ is conserved.


2.3.6 Physical Pendulum 


Here we discuss a rigid body of mass $m$ with moment of inertia $I$ with respect to a
(horizontal) axis of rotation: a plane pendulum. (A rotational pendulum would move
freely about a point, as discussed in Sect. 2.4.10. A mathematical pendulum is a
point mass which moves. It has $I = m s^2$, but is otherwise no easier to treat. On the
other hand, friction is neglected for the time being. It will be accounted for in the
next section.) The angle $\theta$ gives the displacement from the equilibrium position (see
Fig. 2.15).

In this situation, the kinetic and potential energies are

$$T = \tfrac12\,I\,\dot\theta^2 \quad\text{and}\quad V = 2I\omega^2\sin^2\tfrac12\theta , \quad\text{with}\quad \omega^2 = \frac{mgs}{I} .$$


As in the case of free fall (Sect. 2.2.8), we have assumed here that the gravitational
field of the Earth is homogeneous. As an aside, formally the same expression holds
for an electric dipole moment $\mathbf{p}$ in a homogeneous electric field $\mathbf{E}$, because there
the potential energy is $V = -\mathbf{p}\cdot\mathbf{E}$ (see Sect. 3.1.4), and for a magnetic moment $\mathbf{m}$
in a homogeneous magnetic field $\mathbf{B}$, since $V = -\mathbf{m}\cdot\mathbf{B}$, according to Sect. 3.2.9.
The following considerations can also be transferred to the pendulum motion of an
undamped compass needle, because for $V$, the origin is not important and $I\omega^2$ is
then the product of the dipole moment and the field strength.

Fig. 2.15 Plane pendulum. The center of mass (full circle) is a distance $s$ from the axis of rotation
(open circle) and a height $h = s\,(1-\cos\theta) = 2s\sin^2\frac12\theta$ above the equilibrium position. Right:
Potential energy $V$ (relative to $2mgs$) as a function of $\theta$ (from $-\pi$ to $\pi$). Dashed curve:
Approximation for harmonic oscillation

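As stressed in Sect. 2.3.2, for such problems with a single unknown and time-independent
energy $T + V$, conservation of energy is useful:

$$E = 2I\,\big\{(\tfrac12\dot\theta)^2 + \omega^2\sin^2\tfrac12\theta\big\} .$$

According to this, $(\frac12\dot\theta)^2 = E/2I - \omega^2\sin^2\frac12\theta$, which is a differential equation of
first order for the unknown function $\theta(t)$.

Small pendulum amplitudes are generally considered, and we may set $\sin\frac12\theta \approx \frac12\theta$.
This leads to the differential equation $E = \frac12 I\,(\dot\theta^2 + \omega^2\theta^2)$ for harmonic oscillation,
viz.,

$$\theta(t) = \theta_0\cos(\omega t) + (\dot\theta_0/\omega)\sin(\omega t) = \hat\theta\cos(\omega t - \varphi) ,$$

with the initial values $\theta(0) = \theta_0 = \hat\theta\cos\varphi$ and $\dot\theta(0) = \dot\theta_0 = \omega\hat\theta\sin\varphi$. The amplitude
$\hat\theta$ then follows from $\hat\theta^2 = 2E/I\omega^2 = \theta_0^2 + (\dot\theta_0/\omega)^2$ and the phase shift (at zero
time) $\varphi$ from $\tan\frac12\varphi = (\hat\theta - \theta_0)\,\omega/\dot\theta_0$. Note that we use the equation
$\tan\frac12\varphi = (1-\cos\varphi)/\sin\varphi$, not the more suggestive $\tan\varphi = \sin\varphi/\cos\varphi$, because the former
fixes $\varphi$ uniquely up to an even multiple of $\pi$, whereas the latter fixes it only up to an
arbitrary multiple of $\pi$. As the integration constants we thus have
either the energy $E$ (or, respectively, $\hat\theta$) and the phase shift $\varphi$ or the initial values $\theta_0$
and $\dot\theta_0$.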

However, we would also like to allow for larger pendulum amplitudes, and for
that we use the abbreviation (with $k \ge 0$)

$$k^2 = \frac{E}{2I\omega^2} = \frac{E}{2mgs} .$$




Fig. 2.16 Pendulum trajectories in phase space ($y \propto p$). These are solutions of the equation
$y^2 + \sin^2 x = k^2$, here for $k^2$ from 0.2 to 1.8 in steps of 0.2 and a periodicity interval
$-\frac12\pi \le x \le \frac12\pi$. Thus $x = \frac12\theta$ and $y = \dot x/\omega$. The dashed red curve ($k^2 = 1$) is the separatrix:
it separates the rotating solutions (green) from the librations (blue). The curves are always
plotted clockwise: for $x > 0$, the velocity decreases ($\ddot x < 0$), for $x < 0$, it increases. This
happens also for the damped oscillation (see Fig. 2.21)


We then have the non-linear differential equation $k^2 = \omega^{-2}\dot x^2 + \sin^2 x$. So far we
have restricted ourselves to $k \ll 1$ and so have been able to use $\sin x \approx x$. In this
way, in the $(x, \dot x)$-plane, we obtained an ellipse with semi-axes $k$ and $k\omega$. With
increasing $k$ ($< 1$), the ellipse increases in size and changes shape; in fact, it no longer
remains an ellipse. For $k = 1$, we have $\dot x = \pm\omega\cos x$. The requirement $|\sin x| \le k$
limits the $x$ values for $k < 1$ (then $k = \sin\frac12\hat\theta$), but no longer for $k > 1$. Hence,
the pendulum rotates (see Fig. 2.16). In all cases, the highest angular velocity is
$\dot\theta_{\max} = 2\omega k = \sqrt{2E/I}$. For $k \gg 1$, the term $\sin^2 x$ is negligible compared to $\omega^{-2}\dot x^2$
and the pendulum then rotates with constant angular velocity $\dot\theta$.

In the differential equation $\dot x^2 = \omega^2\,(k^2 - \sin^2 x)$, the variables can be separated:

$$\omega\,dt = \frac{dx}{\sqrt{k^2 - \sin^2 x}} .$$

We first consider the oscillations (the case $k < 1$) and then the rotating solutions
($k > 1$). In both cases, we choose the zero time (i.e., the second fitting parameter in
addition to $k$ or $E$) at $\theta(0) = 0$ (with $\dot\theta > 0$).

For $k < 1$, we transform $\sin x = k\sin z$, thus $\cos x\,dx = k\cos z\,dz$: the denominator
becomes $k\cos z$ and $dx/\sqrt{k^2 - \sin^2 x}$ becomes $dz/\sqrt{1 - k^2\sin^2 z}$. Then we
arrive at the incomplete elliptic integral of the first kind (in the Legendre normal
form)

$$F(\varphi\,|\,k^2) = \int_0^{\varphi}\frac{dz}{\sqrt{1 - k^2\sin^2 z}} ,$$

and hence,




$$\omega t = F\Big(\arcsin\frac{\sin(\frac12\theta)}{k}\,\Big|\,k^2\Big) .$$

This equation yields the oscillation period $T$ (see Fig. 2.17), because for $\frac14 T$, we
have $\sin\frac12\theta = k$ or $\varphi = \frac12\pi$:

$$\tfrac14\,\omega T = F(\tfrac12\pi\,|\,k^2) = K(k^2) .$$

Here $K(k^2)$ is a complete elliptic integral of the first kind. (More details on the special
functions mentioned here can be found, e.g., in [1], or in particular [2].)

Fig. 2.17 Dependence of the oscillation period $T$ on the pendulum amplitude $\hat\theta$ (from 0° to 180°),
here in relation to the oscillation period $T_0 = 2\pi/\omega$ for small amplitude. Dashed blue curve:
Limiting curve $(2/\pi)\ln(4/\cos\frac12\hat\theta)$ for large amplitude. Continuous red curve: Complete elliptic
integral of the first kind $K(\sin^2\frac12\hat\theta)$ up to a factor $\frac12\pi$ (see Fig. 2.33)

The Legendre normal form of the elliptic integrals mentioned here depends
on a circular function. If we take $\sin z$ as integration variable $t$, then the incomplete
elliptic integral reads

$$F(\varphi\,|\,k^2) = \int_0^{\sin\varphi}\frac{dt}{\sqrt{(1-t^2)(1-k^2t^2)}} ,$$

and the complete elliptic integral

$$K(k^2) = \int_0^1\frac{dt}{\sqrt{(1-t^2)(1-k^2t^2)}} .$$

Thus we only need a purely algebraic integrand.

Fig. 2.18 The amplitude of the elliptic functions, $\varphi = \operatorname{am} F$, during a quarter period
for $k^2 = 0$ (black), 0.5 (red), 0.9 (blue), and 0.99 (green). This is needed for the Jacobi
functions, e.g., sine amplitudes (see Fig. 2.31). The dependence of the inverse functions $F(\varphi)$
can also be read off

If the pendulum oscillates with small angle amplitudes, then $k^2 \approx 0$. If we expand
the integrand for $k^2 < 1$ in terms of powers of $k^2$ and integrate term by term, this
yields

$$K(k^2) = \frac{\pi}{2}\sum_{n=0}^{\infty}\Big(\frac{(2n)!}{2^{2n}\,(n!)^2}\Big)^2 k^{2n} , \quad\text{for } k^2 < 1 ,$$

and thus $T = 2\pi\omega^{-1}\,(1 + \frac14 k^2 + \frac{9}{64}k^4 + \cdots)$. Only for amplitudes larger than 23°
does the bracket deviate by more than 1% from unity. In the special case $k^2 = 1$, the
oscillation period $T$ increases beyond all limits, because for $k' = \sqrt{1-k^2} \ll 1$ it is

$$K(k^2) = \sum_{n=0}^{\infty}\Big(\frac{(2n)!}{2^{2n}\,(n!)^2}\Big)^2\Big(\ln\frac{4}{k'} - 2\sum_{j=1}^{2n}\frac{(-1)^{j-1}}{j}\Big)\,k'^{\,2n} = \ln\frac{4}{k'} + \cdots ,$$

as shown in Fig. 3.14. We shall use these relations in electrodynamics.
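The amplitude dependence of the period, $T/T_0 = (2/\pi)\,K(\sin^2\frac12\hat\theta)$, can be evaluated with a few lines of code. The sketch below does not use the series above but the arithmetic-geometric mean, via the standard identity $K(k^2) = \pi/(2\,\mathrm{AGM}(1, k'))$ with $k' = \sqrt{1-k^2}$, which is swapped in here as an assumption for convenience. The function names are likewise illustrative.

```python
import math

def complete_elliptic_k(m):
    """Complete elliptic integral K(m) with parameter m = k^2, 0 <= m < 1.

    Uses K = pi / (2 * AGM(1, sqrt(1 - m))); the AGM converges quadratically,
    so a fixed number of iterations is more than sufficient in double precision.
    """
    a, b = 1.0, math.sqrt(1.0 - m)
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def period_ratio(amplitude_deg):
    """T / T0 for a pendulum with angular amplitude given in degrees (< 180)."""
    k2 = math.sin(math.radians(amplitude_deg) / 2) ** 2
    return 2.0 / math.pi * complete_elliptic_k(k2)

def series_ratio(amplitude_deg):
    """First three terms of the small-amplitude series 1 + k^2/4 + 9 k^4/64."""
    k2 = math.sin(math.radians(amplitude_deg) / 2) ** 2
    return 1.0 + k2 / 4 + 9 * k2**2 / 64

for angle in (10, 23, 45, 90, 150):
    print(angle, period_ratio(angle), series_ratio(angle))
```

For small amplitudes the truncated series and the AGM value agree closely; towards 180° the series converges ever more slowly, while the AGM evaluation remains cheap.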

In order to obtain the amplitudes as a function of time, however, we also need
the inverse functions of the incomplete elliptic integrals of the first kind, namely the
(angle) amplitude of $F$ (see Fig. 2.18):

$$\tau = F(\varphi\,|\,k^2) \quad\Longleftrightarrow\quad \varphi = \operatorname{am}(\tau\,|\,k^2) = \operatorname{am}\tau .$$

Then our result with $\tau = \omega t$ can be brought into the form (see Fig. 2.19)

$$\sin(\tfrac12\theta) = k\sin(\operatorname{am}\tau) = k\operatorname{sn}(\tau\,|\,k^2) = k\operatorname{sn}\tau .$$

The Jacobi elliptic function sinus amplitudinis $\operatorname{sn}\tau$ arises. It is odd in $\tau$ and, like all
elliptic functions, it is doubly periodic, if we allow for complex arguments:

$$\operatorname{sn}\tau = \operatorname{sn}\{\tau + 4K(k^2)\} = \operatorname{sn}\{\tau + 2iK(1-k^2)\} .$$

For $k^2 = 0$, it is $\sin\tau$, and for $k^2 = 1$ (with $K \to \infty$), it is $\tanh\tau$.


For the rotating solutions (with $k > 1$), the calculation is easy, because here,
even without the above-mentioned transformation $x \to z$, the differential equation
$\omega\,dt = dx/\sqrt{k^2 - \sin^2 x} = k^{-1}\,dx/\sqrt{1 - k^{-2}\sin^2 x}$ leads to an incomplete elliptic
integral of the first kind:

$$t = \frac{F(\frac12\theta\,|\,k^{-2})}{k\omega} \quad\text{and}\quad \tfrac12\theta = \operatorname{am}(k\omega t\,|\,k^{-2}) .$$

For $\theta = \pi$, we have half a rotation and the time $K(k^{-2})/k\omega$.

Fig. 2.19 Pendulum amplitude $\theta(t)$ for one period (as a function of $t/T$) when $k^2 = 0.5$ (red),
0.9 (blue), and 0.99 (green)


2.3.7 Damped Oscillation 


If we had restricted ourselves to small displacements above, then we would still have
had the simple differential equation for the harmonic oscillation:

$$\ddot x + \omega^2 x = 0 .$$

Multiplying by $\dot x$ and integrating over $t$, we deduce the “conservation of energy”,

$$\dot x^2 + \omega^2 x^2 = \text{const.}$$


But the harmonic oscillation can also be perturbed by other additional terms. In
particular, it normally decays, i.e., it is damped. We now write $\omega_0$ for the $\omega$ used so
far, because the angular frequency of the oscillation depends upon the damping, as
we shall see shortly.

We assume Stokes’s friction because only comparably small velocities occur and
therefore a term linear in $\dot x$ will contribute more than a squared one. We thus consider
the differential equation

$$\ddot x + 2\gamma\dot x + \omega_0^2\,x = 0 , \quad\text{with } \gamma > 0 .$$

In the solutions, $\gamma$ can be viewed as the decay coefficient and $\gamma^{-1}$ as the decay or
relaxation time. Because of the damping, the conservation of energy does not help, but
because the linear differential equation is homogeneous, the ansatz $x = c\exp(-i\omega t)$
leads to (see Fig. 2.20)

$$\omega^2 + 2i\gamma\omega = \omega_0^2 \quad\Longrightarrow\quad \omega_\pm = \pm\sqrt{\omega_0^2 - \gamma^2} - i\gamma .$$

In the following, the angular frequency

$$\Omega = \sqrt{|\omega_0^2 - \gamma^2|}$$

will be useful, because $\omega_\pm = \pm\Omega - i\gamma$ for $\gamma < \omega_0$ and $\omega_\pm = -i\,(\gamma \mp \Omega)$ with $\gamma > \Omega$
for $\gamma > \omega_0$.

Fig. 2.20 Dependence of the pair $\omega_\pm$ in the complex $\omega$-plane with increasing damping. For $\gamma < \omega_0$,
they start from $\pm\omega_0$ (*) and move symmetrically towards each other on a semi-circle (from * to o
to +) until, for $\gamma = \omega_0$, they coincide at $-i\omega_0$ (•). Because $|\omega_+\omega_-| = \omega_0^2$, for $\gamma > \omega_0$ they move
apart again as mirror points (×) of the circle on the imaginary axis. Damped oscillations occur only
for negative imaginary part

Fig. 2.21 Damped oscillations for $\gamma = \omega_0/10$. As in Fig. 2.16, $\dot x/\omega_0$ is represented as a function of
$x$, with equal time intervals between neighboring points (•). For other initial values, the figure is
rotated about the origin (o), where all orbits end. This is the attractor of all orbits


Hence we have two linearly independent solutions $\exp(-i\omega_\pm t)$. Note that, for
$\gamma = \omega_0$, the two solutions $x_\pm$ coincide, but their difference at the transition $\gamma \to \omega_0$
is to a first approximation proportional to $t\exp(-\gamma t)$, which then delivers a linearly
independent solution. Therefore, we can adjust $x(t)$ to the initial values $x_0$ and $\dot x_0$
(see Fig. 2.21):

$$\gamma < \omega_0 : \quad x = \exp(-\gamma t)\,\Big(x_0\cos\Omega t + \frac{\dot x_0 + \gamma x_0}{\Omega}\,\sin\Omega t\Big) ,$$

$$\gamma = \omega_0 : \quad x = \exp(-\gamma t)\,\big(x_0 + (\dot x_0 + \gamma x_0)\,t\big) ,$$

$$\gamma > \omega_0 : \quad x = \exp(-\gamma t)\,\Big(x_0\cosh\Omega t + \frac{\dot x_0 + \gamma x_0}{\Omega}\,\sinh\Omega t\Big) .$$

Apart from the exponential factor in front of the brackets, the last two brackets no
longer describe periodic motion. What we have here is in fact aperiodic damping:
for $\gamma = \omega_0$, critical damping, for $\gamma > \omega_0$, supercritical damping (see Fig. 2.22).

Fig. 2.22 Left: Critically damped oscillation ($\gamma = \omega_0$). Right: Supercritically damped oscillation
(with $\gamma = 2\omega_0$). In both panels, $\dot x/\omega_0$ is plotted against $x$. The representation is the same as in
the last figure, except that here the trajectories depend upon the initial conditions, but all finish
at the origin
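The three cases can be collected in one function and checked against the differential equation $\ddot x + 2\gamma\dot x + \omega_0^2 x = 0$. The following is a minimal sketch of such a check. The function names, the finite-difference step and the sample parameters are assumptions chosen for illustration.

```python
import math

def damped_x(t, x0, v0, gamma, omega0):
    """Displacement x(t) of the damped oscillator for given x(0) = x0, x'(0) = v0."""
    big_omega = math.sqrt(abs(omega0**2 - gamma**2))
    a = v0 + gamma * x0
    decay = math.exp(-gamma * t)
    if gamma < omega0:      # damped oscillation
        return decay * (x0 * math.cos(big_omega * t)
                        + a / big_omega * math.sin(big_omega * t))
    if gamma == omega0:     # critical damping
        return decay * (x0 + a * t)
    # supercritical damping
    return decay * (x0 * math.cosh(big_omega * t)
                    + a / big_omega * math.sinh(big_omega * t))

def residual(t, x0, v0, gamma, omega0, h=1e-4):
    """Central-difference estimate of x'' + 2 gamma x' + omega0^2 x at time t."""
    xm, x, xp = (damped_x(t + s, x0, v0, gamma, omega0) for s in (-h, 0.0, h))
    second = (xp - 2 * x + xm) / h**2
    first = (xp - xm) / (2 * h)
    return second + 2 * gamma * first + omega0**2 * x

for gamma in (0.1, 1.0, 2.0):          # under-, critically, over-damped for omega0 = 1
    print(gamma, damped_x(0.7, 1.0, 0.5, gamma, 1.0), residual(0.7, 1.0, 0.5, gamma, 1.0))
```

The residual is not exactly zero because the derivatives are approximated by finite differences; it should merely be small compared with the individual terms.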


2.3.8 Forced Oscillation 


For the time being, we assume a force acting periodically with angular frequency $\omega$
and consider the inhomogeneous linear differential equation

$$\ddot x + 2\gamma\dot x + \omega_0^2\,x = c\cos(\omega t) .$$

On the right-hand side, we could have assumed a Fourier integral, and then
we would have to superpose the corresponding solutions. The general solution is
composed of the general solution of the homogeneous equation treated above and
a special solution of the inhomogeneous equation. The special solution describes
here the long-time behavior (with $\gamma > 0$), because the solutions of the homogeneous
equation decay exponentially in time. They are important only for the initial process
and are needed to satisfy the initial conditions.


For a special solution, we make the ansatz

$$x = C\cos(\omega t - \phi) = C\,[\cos\phi\cos(\omega t) + \sin\phi\sin(\omega t)] .$$

For $\phi \neq 0$, the solution is delayed with respect to the exciting oscillation.
This is why we write $\omega t - \phi$ and expect $\phi > 0$. In order to ensure that $\phi$ is unique (mod
$2\pi$), we require that $C$ should have the same sign as $c$. With this ansatz and after
comparing coefficients of $\cos(\omega t)$ and $\sin(\omega t)$, the differential equation leads to the
conditions

$$(\omega_0^2 - \omega^2)\cos\phi + 2\gamma\omega\sin\phi = c/C > 0 ,$$
$$(\omega_0^2 - \omega^2)\sin\phi - 2\gamma\omega\cos\phi = 0 ,$$

which we can solve for the unknown $C$ and $\phi$. For unique determination of $\phi$, we
first consider $\omega = \omega_0$ and find here $\phi = \frac12\pi$ mod $2\pi$. Hence, $0 \le \phi \le \pi$ has to
hold. Therefore, we derive $\phi$ from $\tan(\frac12\pi - \phi) = \cot\phi = (\omega_0^2 - \omega^2)/2\gamma\omega$ and
use $\sin\phi = 1/\sqrt{1 + \cot^2\phi}$ for $c/C$ (see Fig. 2.23):

$$C = \frac{c}{\sqrt{(\omega_0^2 - \omega^2)^2 + 4\gamma^2\omega^2}} \quad\text{and}\quad \phi = \frac{\pi}{2} - \arctan\frac{\omega_0^2 - \omega^2}{2\gamma\omega} .$$

For $\omega \approx \omega_0$, the ratio $C/c$ is very large. For $\gamma \neq 0$, the maximum lies at somewhat
lower frequencies than $\omega_0$. However, for larger amplitudes, the starting equation is
no longer valid, because then the free oscillation becomes anharmonic. Note also
that the phase shift $\phi$ increases with $\omega$. For $\omega \ll \omega_0$, it is negligible, for $\omega = \omega_0$, it
takes the value $\frac12\pi$, and for $\omega \gg \omega_0$, it tends to $\pi$. The higher the driving frequency,
the more the forced oscillation is delayed, until it finally oscillates in opposite phase.
This transition from in-phase to opposite-phase becomes ever more sudden with
decreasing damping $\gamma$.
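The amplitude and phase formulas can be checked by inserting $x = C\cos(\omega t - \phi)$ back into the differential equation. The sketch below does this with analytic derivatives of the ansatz. The function names and the sample values of $c$, $\gamma$, $\omega_0$ are assumptions for illustration, and the formulas are used for $\omega > 0$, $\gamma > 0$ only.

```python
import math

def steady_state(c, omega, omega0, gamma):
    """Amplitude C and phase lag phi of the forced oscillation x = C cos(omega t - phi)."""
    detuning = omega0**2 - omega**2
    amplitude = c / math.sqrt(detuning**2 + 4 * gamma**2 * omega**2)
    phi = math.pi / 2 - math.atan(detuning / (2 * gamma * omega))
    return amplitude, phi

def ode_residual(t, c, omega, omega0, gamma):
    """x'' + 2 gamma x' + omega0^2 x - c cos(omega t) for the steady-state ansatz."""
    amplitude, phi = steady_state(c, omega, omega0, gamma)
    x = amplitude * math.cos(omega * t - phi)
    dx = -amplitude * omega * math.sin(omega * t - phi)
    ddx = -amplitude * omega**2 * math.cos(omega * t - phi)
    return ddx + 2 * gamma * dx + omega0**2 * x - c * math.cos(omega * t)

c, omega0, gamma = 1.0, 1.0, 0.1
for omega in (0.2, 0.9, 1.0, 1.1, 3.0):
    C, phi = steady_state(c, omega, omega0, gamma)
    print(omega, C, phi, ode_residual(0.37, c, omega, omega0, gamma))
```

At $\omega = \omega_0$ the sketch returns $\phi = \frac12\pi$ and $C = c/2\gamma\omega_0$, and the printed phases increase monotonically from near 0 towards $\pi$.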


Fig. 2.23 Forced oscillation. Left: Amplitude of the ratio $\omega_0^2\,C/c$ as a function of $\omega/\omega_0$. Right:
Phase shift $\phi$ for $\gamma = 0.1\,\omega_0$ (continuous red curve) and for $\gamma = 0.5\,\omega_0$ (dashed blue curve)


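Somewhat more concise is the treatment using complex variables. The ansatz

$$x = \operatorname{Re}\{\hat x\exp(-i\omega t)\} , \quad\text{with}\quad \hat x = C\exp(i\phi) ,$$

leads via the differential equation to $(\omega_0^2 - \omega^2 - 2i\gamma\omega)\,\hat x = c$. With

$$\omega_\pm = \pm\sqrt{\omega_0^2 - \gamma^2} - i\gamma$$

from the last section, or $\omega^2 + 2i\gamma\omega - \omega_0^2 = (\omega - \omega_+)(\omega - \omega_-)$, we then arrive at

$$\hat x = \frac{c}{(\omega - \omega_-)(\omega_+ - \omega)} = \frac{c}{\omega_+ - \omega_-}\Big(\frac{1}{\omega - \omega_-} - \frac{1}{\omega - \omega_+}\Big) .$$

For $\gamma \neq \omega_0$, the amplitude $\hat x$ thus has two simple poles below the real $\omega$-axis (see
Fig. 2.20). This representation is particularly suitable when the driving force is not
purely harmonic and therefore has to be integrated (according to Fourier); this is
then straightforward using the theorem of residues.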

In addition, in many cases it is not only the long-time behavior that is of interest.
Therefore, we still wish to generalize the previous considerations. To this end, we
shall transform the inhomogeneous linear differential equation

$$\ddot x(t) + 2\gamma\dot x(t) + \omega_0^2\,x(t) = f(t) ,$$

with a Laplace transform, viz.,

$$x \to \mathcal{L}\{x\} = \int_0^\infty \exp(-st)\,x(t)\,dt ,$$

into an algebraic equation [3], where $\operatorname{Re} s > 0$ has to hold. Naturally, the solution
here still has to undergo the inverse Laplace transform. Note that the great advantage
of the Laplace transform over the similar Fourier transform is the fact that only one
integration limit is unrestricted. The Laplace-transformed derivative $\dot x$ is equal to

$$\mathcal{L}\{\dot x\} = s\,\mathcal{L}\{x\} - x(+0) ,$$

since partial integration delivers $\int_0^\infty e^{-st}\dot x\,dt = e^{-st}x\,\big|_0^\infty + s\int_0^\infty e^{-st}x\,dt$. The
region $t < 0$ is of no interest. Hence for $t = 0$, $x$ may even jump from $x(-0)$ to
finite $x(+0)$. Since $\mathcal{L}\{\ddot x\} = s\,(s\,\mathcal{L}\{x\} - x(0)) - \dot x(0)$ and $s^2 + 2\gamma s + \omega_0^2 = (s + i\omega_+)(s + i\omega_-)$,
the original differential equation leads to

$$\mathcal{L}\{x\} = \frac{\mathcal{L}\{f\} + (s + 2\gamma)\,x(0) + \dot x(0)}{(s + i\omega_+)(s + i\omega_-)} .$$

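The result may also be written as

$$\mathcal{L}\{x\} = \mathcal{L}\{x_0\} + \mathcal{L}\{g\}\,\mathcal{L}\{f\} ,$$

with

$$\mathcal{L}\{g\} = \frac{1}{(s + i\omega_+)(s + i\omega_-)} = \frac{i}{2\sqrt{\omega_0^2 - \gamma^2}}\Big(\frac{1}{s + i\omega_+} - \frac{1}{s + i\omega_-}\Big) ,$$

if $x_0(t)$ solves the associated homogeneous differential equation under the given
initial conditions: $\ddot x_0 + 2\gamma\dot x_0 + \omega_0^2\,x_0 = 0$, along with $x_0(0) = x(0)$ and $\dot x_0(0) = \dot x(0)$.
According to the last section, we can determine this auxiliary quantity.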

The product of the Laplace-transformed functions $\mathcal{L}\{g\}\cdot\mathcal{L}\{f\}$ comes from a convolution
integral:

$$x(t) = x_0(t) + \int_0^t g(t - t')\,f(t')\,dt' .$$

Since we are only interested here in $0 \le t' \le t$, we may then amend both functions $g$
and $f$ so that they vanish for negative arguments. Then we may also integrate from
$-\infty$ to $+\infty$. This leads to the convolution theorem, as for the Fourier transform on
p. 22, because the Laplace transform

$$\mathcal{L}\{F\} = \iint \exp(-st)\,g(t - t')\,f(t')\,dt\,dt'$$

arises for the function $F(t) = \int g(t - t')\,f(t')\,dt'$. And because we have
$\exp(-st) = \exp\{-s(t - t')\}\exp(-st')$ with the new integration variable $\tau = t - t'$
(and equal integration limits for $\tau$ and $t$), this double integral can be split into the
product of the Laplace-transformed functions of $g$ and $f$, as required.

In order to determine $g$, we compare the expression $\{(s + i\omega_+)(s + i\omega_-)\}^{-1}$ for
$\mathcal{L}\{g\}$ with that for $\mathcal{L}\{x\}$. The two Laplace-transformed functions are apparently
equal, if $x(0) = 0$, $\dot x(0) = 1$ holds and $f$ vanishes, i.e., the oscillation is not forced. If
we set $\tau = t - t'$, then for $g(\tau)$, the constraints are

$$g(0) = 0 , \quad \dot g(0) = 1 , \quad\text{and}\quad \ddot g + 2\gamma\dot g + \omega_0^2\,g = 0 .$$

Consequently, according to the last section, we already know $g(\tau)$. In particular,
we have $g(\tau) = \exp(-\gamma\tau)\,\Omega^{-1}\sin(\Omega\tau)$ with $\Omega = \sqrt{\omega_0^2 - \gamma^2}$ for $\gamma < \omega_0$ (see
Fig. 2.24).
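The convolution formula can be tried out numerically. The sketch below evaluates $\int_0^t g(t - t')\,f(t')\,dt'$ with the midpoint rule for a constant force switched on at $t = 0$ and zero initial values (so that $x_0 = 0$), and compares it with the closed-form solution of $\ddot x + 2\gamma\dot x + \omega_0^2 x = 1$. The choice of force, the parameter values and the function names are assumptions made for this illustration.

```python
import math

def green(tau, gamma, omega0):
    """Green function of the damped oscillator for gamma < omega0 (zero for tau < 0)."""
    if tau < 0:
        return 0.0
    big_omega = math.sqrt(omega0**2 - gamma**2)
    return math.exp(-gamma * tau) * math.sin(big_omega * tau) / big_omega

def response(t, force, gamma, omega0, steps=4000):
    """x(t) - x0(t) = integral_0^t g(t - t') f(t') dt', midpoint rule."""
    h = t / steps
    return h * sum(green(t - (i + 0.5) * h, gamma, omega0) * force((i + 0.5) * h)
                   for i in range(steps))

def step_response_exact(t, gamma, omega0):
    """Solution of x'' + 2 gamma x' + omega0^2 x = 1 with x(0) = 0, x'(0) = 0."""
    big_omega = math.sqrt(omega0**2 - gamma**2)
    transient = math.exp(-gamma * t) * (math.cos(big_omega * t)
                                        + gamma / big_omega * math.sin(big_omega * t))
    return (1.0 - transient) / omega0**2

def unit_force(t):
    return 1.0

gamma, omega0 = 0.2, 1.0
for t in (1.0, 5.0, 20.0):
    print(t, response(t, unit_force, gamma, omega0), step_response_exact(t, gamma, omega0))
```

For long times both values approach the static displacement $1/\omega_0^2$, as expected once the homogeneous part has decayed.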

Note that the integral often extends to $\infty$, where $g(\tau)$ then has to vanish for $\tau < 0$.
This function remains continuous, but its first derivative at $\tau = 0$ has to jump from
zero to one. This leads to the differential equation $\ddot g(\tau) + 2\gamma\dot g(\tau) + \omega_0^2\,g(\tau) = \delta(\tau)$,
thus the starting equation with $f(t) = \delta(t)$ as inhomogeneity. Generally, solutions of
linear differential equations with the delta function as inhomogeneity are called Green
functions. Using these, the solutions for other inhomogeneities can be represented
as convolution integrals (Problem 2.38). We encountered the Green function for the
Laplace operator on p. 26, and the one here will be generalized in Sect. 2.3.10.

Fig. 2.24 Green function $g(\tau)$ (for the damped oscillator). For $\tau \neq 0$, it satisfies the homogeneous
differential equation $\ddot g + 2\gamma\dot g + \omega_0^2\,g = 0$. For $\tau = 0$, its first derivative jumps by one. Hence, the
second derivative is given there by the delta function $\delta(\tau)$

If for finite damping only the long-time behavior is of interest, then we may leave
out $x_0(t)$ and take $-\infty$ as the lower integration limit. Then we arrive at a convolution
integral from $-\infty$ to $+\infty$.


2.3.9 Coupled Oscillations and Normal Coordinates 


So far we have restricted ourselves to oscillations of just one coordinate. Now we
consider several coordinates ($f > 1$), e.g., a double pendulum (one hanging from the
other) or several point masses coupled to each other by springs (atoms in a molecule
or in a crystal). Here we start from a conservative system with the potential energy
$V(x^1, \ldots, x^f)$ and choose the origin of all $f$ coordinates $x^k$ in their equilibrium
position. Then all forces vanish:

$$F_k = -\frac{\partial V}{\partial x^k}\Big|_0 = 0 , \quad\text{for } k\in\{1,\ldots,f\} .$$

We assume a stable equilibrium, i.e., small displacements from the equilibrium cost
energy. Then the extremum of $V$ has to be a local minimum, and for the corresponding
gauge ($V = 0$ at equilibrium), according to Taylor, we have

$$V = \frac12\sum_{kl}\frac{\partial^2 V}{\partial x^k\,\partial x^l}\Big|_0\,x^k x^l = \frac12\sum_{kl} A_{kl}\,x^k x^l , \quad\text{with}\quad A_{kl} = A_{lk} = A_{kl}{}^* ,$$

if we neglect higher-order terms: the pendulum is just barely displaced, and no
anharmonic forces act between the atoms. Here the coefficients do not depend upon
the time $t$.

In addition, we need the kinetic energy, for which we make an ansatz of the form

$$T = \frac12\sum_{kl} B_{kl}\,\dot x^k\dot x^l , \quad\text{with}\quad B_{kl} = B_{lk} = B_{kl}{}^* ,$$



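where it is assumed that these coefficients do not depend on time (which is approximately
true only occasionally). In any case, no linear terms in $\dot x^k$ should appear,
because otherwise they would change sign for $t \to -t$.

For $k\in\{1,\ldots,f\}$ and since $A_{kl} = A_{lk}$ and $B_{kl} = B_{lk}$, the Lagrange equations
now deliver

$$0 = \frac{d}{dt}\frac{\partial L}{\partial\dot x^k} - \frac{\partial L}{\partial x^k} = \sum_l\,\big(B_{kl}\,\ddot x^l + A_{kl}\,x^l\big) .$$

If we take $A$ and $B$ as square matrices and $(x^1, \ldots, x^f)$ as a row vector $\tilde x$ (with $x$ the
corresponding column vector), we then have

$$V = \tfrac12\,\tilde x A x \quad\text{and}\quad T = \tfrac12\,\dot{\tilde x} B \dot x ,$$

and also

$$B\ddot x + A x = 0 ,$$

or $\ddot x = -B^{-1}A\,x$. For one degree of freedom ($f = 1$), we could have written simply
$\omega^2$ instead of the matrix product $B^{-1}A$.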

Now we would like to make a transition to new coordinates, called normal coor- 
dinates x’, relative to which the matrices A and B become diagonal, the oscillations 
thus become decoupled, and the solutions are already known. The total energy is 
then the sum of the energies of the individual decoupled oscillators. 

If we set 


$$x = C x' \quad\Longrightarrow\quad V = \tfrac12\, \tilde x' \tilde C A C\, x' \quad\text{and}\quad T = \tfrac12\, \dot{\tilde x}' \tilde C B C\, \dot x' ,$$
then we search for a matrix $C$ which diagonalizes $\tilde C A C$ as well as $\tilde C B C$. Here we
choose the free factor (only the product $C x'$ is fixed) such that $\tilde C B C = 1$ holds.
Then the diagonal elements $\lambda$ of $\tilde C A C$ are the squares of the angular frequencies.
These are the frequencies with which the normal coordinates oscillate. The amplitudes
and phases are adjusted to the initial values.

In this case, $A x$ becomes $A C x'$, and with $\ddot x' = -\lambda x'$, $B \ddot x$ becomes $-\lambda B C x'$.
The vector $C x'$ will be denoted by $c$ and we shall seek $f$ such column vectors and
combine them to form the matrix $C$. Finally, from $B \ddot x + A x = 0$, we have

$$(A - \lambda B)\, c = 0 , \quad \text{with } A = A^* = \tilde A , \quad B = B^* = \tilde B , \quad c = c^* .$$


The homogeneous linear system of equations $(A - \lambda B)\, c = 0$ is an eigenvalue problem,
because it is soluble only for suitable eigenvalues $\lambda_k$. With these, we determine
the eigenvectors $c_k$. Apart from the fact that in general the number of degrees of freedom
is $f \neq 3$, this eigenvalue problem differs from that of the principal axis transformation
for the moment of inertia in Sect. 2.2.11 in that $B$ was a unit matrix there.

The eigenvalues can be determined from the characteristic equation
$$\det (A - \lambda B) = 0 .$$


Since we are working with Hermitian matrices with $f$ rows, there are $f$ real eigenvalues
$\lambda_k$ and associated eigenvectors $c_k$, which then follow from

$$(A - \lambda_k B)\, c_k = 0 .$$

These eigenvectors are determined only up to a factor, which we shall soon choose
in a convenient way. So if we combine the total set of $f$ column vectors $\{c_k\}$, each
with $f$ components, to form an eigenvector matrix $C = (c_1, \dots, c_f)$, we arrive at

$$\tilde C B C = 1 .$$

With the help of the $k$th diagonal element of this matrix and an appropriate "normalization
factor" (a scale transformation), we can choose the $k$th (row and) column
vector and make all non-diagonal elements (in different row and column vectors)
equal to zero. This can be seen immediately for different eigenvalues $\lambda_k \neq \lambda_{k'}$,
because $(\lambda_k - \lambda_{k'})\, \tilde c_k B c_{k'}$ is the same as $\tilde c_k (A - A)\, c_{k'}$. But for equal eigenvalues
$\lambda_k = \lambda_{k'}$ (degeneracy) all linear combinations of these eigenvectors are still eigenvectors,
and this freedom can be exploited for $\tilde C B C = 1$. The matrix $\tilde C A C = \Lambda$ is
then also diagonal. Thus we have

$$2T = \dot{\tilde x} B \dot x = \dot{\tilde x}' \tilde C B C\, \dot x' = \dot{\tilde x}'\, 1\, \dot x' ,$$
$$2V = \tilde x A x = \tilde x' \tilde C A C\, x' = \tilde x' \Lambda x' .$$


In the new coordinates, the kinetic and potential energy no longer contain mixed 
terms. The f harmonic oscillations are decoupled in the normal coordinates. 

The eigenvalues $\lambda$ of $\Lambda$ are the squares of the desired angular frequencies because
they represent the harmonic oscillation in the expression $\tfrac12 m\, (\dot x^2 + \omega^2 x^2)$ for the
energy.

For example, for two coupled oscillations ($f = 2$), we thus arrive at the eigenfrequencies

$$\omega_\pm^{\,2} = \frac{K \pm \sqrt{K^2 - 4 \det A\, \det B}}{2 \det B} , \quad \text{with } K = A_{11} B_{22} + A_{22} B_{11} - 2 A_{12} B_{12} .$$

To these belong the two eigenvectors $c_\pm$, each with two components, whose ratio is

$$\frac{c_{2\pm}}{c_{1\pm}} = - \frac{A_{11} - \omega_\pm^{\,2} B_{11}}{A_{12} - \omega_\pm^{\,2} B_{12}} ,$$

and which are normalized via $c_{1\pm}^{\,-2} = B_{11} + 2 B_{12}\, (c_{2\pm}/c_{1\pm}) + B_{22}\, (c_{2\pm}/c_{1\pm})^2$. With
this,
$$C = \begin{pmatrix} c_{1+} & c_{1-} \\ c_{2+} & c_{2-} \end{pmatrix}$$
and its inverse matrix (see p. 71) can be calculated. In normal coordinates, the solutions
read


$$x_\pm' = x_{0\pm}' \cos(\omega_\pm t) + \frac{\dot x_{0\pm}'}{\omega_\pm} \sin(\omega_\pm t) ,$$

where the coefficients $x_{0\pm}'$ and $\dot x_{0\pm}'$ follow from the initial conditions:

$$x_0' = C^{-1} x_0 \quad\text{and}\quad \dot x_0' = C^{-1} \dot x_0 .$$

Note that, according to p. 102, we may thus also write $x_\pm' = \hat x_\pm' \cos(\omega_\pm t - \varphi_\pm)$.
Since all unknown quantities have then been determined from the matrix elements
of $A$ and $B$ and from the initial values, the solution $x = C x'$ can finally be calculated
(Problems 2.39-2.42).
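
The recipe for $f = 2$ can be sketched numerically. The following is a minimal illustration, not part of the original text: the function names and the example system (two unit masses, each tied to a wall by a spring of stiffness 1 and coupled to each other by a spring of stiffness 0.5) are assumptions chosen for the demonstration, and the code assumes $A_{12} - \omega_\pm^2 B_{12} \neq 0$, i.e., genuinely coupled coordinates.

```python
import math


def normal_modes_2x2(A, B):
    """Solve (A - lam*B) c = 0 for symmetric 2x2 matrices A and B.

    Returns [(lam_plus, c_plus), (lam_minus, c_minus)], where lam = omega^2
    and each eigenvector c = (c1, c2) is normalized so that c~ B c = 1.
    Assumes A12 - lam*B12 != 0 (the coordinates are really coupled).
    """
    a11, a12, a22 = A[0][0], A[0][1], A[1][1]
    b11, b12, b22 = B[0][0], B[0][1], B[1][1]
    det_a = a11 * a22 - a12 * a12
    det_b = b11 * b22 - b12 * b12
    k = a11 * b22 + a22 * b11 - 2.0 * a12 * b12
    root = math.sqrt(k * k - 4.0 * det_a * det_b)
    modes = []
    for sign in (+1.0, -1.0):
        lam = (k + sign * root) / (2.0 * det_b)
        # Ratio c2/c1 from the first row of (A - lam*B) c = 0.
        ratio = -(a11 - lam * b11) / (a12 - lam * b12)
        # Normalization c1^-2 = B11 + 2 B12 (c2/c1) + B22 (c2/c1)^2.
        c1 = 1.0 / math.sqrt(b11 + 2.0 * b12 * ratio + b22 * ratio * ratio)
        modes.append((lam, (c1, c1 * ratio)))
    return modes


def congruence(columns, M):
    """Return C~ M C, where the columns of C are the given 2-vectors."""
    return [[sum(ci[r] * M[r][s] * cj[s] for r in range(2) for s in range(2))
             for cj in columns] for ci in columns]


if __name__ == "__main__":
    A = [[1.5, -0.5], [-0.5, 1.5]]   # potential-energy matrix (assumed example)
    B = [[1.0, 0.0], [0.0, 1.0]]     # kinetic-energy matrix (unit masses)
    modes = normal_modes_2x2(A, B)
    for lam, c in modes:
        print("omega^2 =", lam, " eigenvector =", c)
    cols = [c for _, c in modes]
    print("C~ B C =", congruence(cols, B))
    print("C~ A C =", congruence(cols, A))
```

For this symmetric example the two modes are the in-phase motion (both masses move together) and the out-of-phase motion, and $\tilde C A C$ comes out diagonal with the two values of $\omega^2$ on the diagonal.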

If the two eigenfrequencies are nearly equal ($\omega_+ \approx \omega_-$), then beats are formed,
i.e., the oscillation amplitudes change periodically, and this all the more clearly as
the amplitudes are close to one another. From

$$x^i = c_{i+} \hat x_+' \cos(\omega_+ t - \varphi_+) + c_{i-} \hat x_-' \cos(\omega_- t - \varphi_-) ,$$

with $\omega_+ > \omega_-$ and positive amplitude $c_{i\pm} \hat x_\pm'$ abbreviated to $C_{i\pm}$, together with the
notation $\omega_\pm = \Omega \pm \omega$ and $\varphi_\pm = \Phi \pm \varphi$, it follows that

$$x^i = +(C_{i+} + C_{i-}) \cos(\Omega t - \Phi) \cos(\omega t - \varphi) - (C_{i+} - C_{i-}) \sin(\Omega t - \Phi) \sin(\omega t - \varphi) .$$

Since $\omega_+ \approx \omega_-$, we have $\omega \ll \Omega$, whence the amplitude of the fast oscillation
(angular frequency $\Omega$) changes slowly and periodically according to

$$\sqrt{(C_{i+} - C_{i-})^2 + 4\, C_{i+} C_{i-} \cos^2(\omega t - \varphi)} .$$


Examples are shown in Fig. 2.25. 
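
The beat envelope can be checked numerically. The sketch below is an illustration under assumed values (the function names, the amplitudes, and the choice $\omega_+ = 9$, $\omega_- = 8$ are not from the original text): it evaluates the superposition of the two modes and the envelope formula, and confirms that the signal never exceeds the envelope.

```python
import math


def two_mode_signal(t, c_plus, c_minus, w_plus, w_minus,
                    phi_plus=0.0, phi_minus=0.0):
    """Superposition C+ cos(w+ t - phi+) + C- cos(w- t - phi-)."""
    return (c_plus * math.cos(w_plus * t - phi_plus)
            + c_minus * math.cos(w_minus * t - phi_minus))


def beat_envelope(t, c_plus, c_minus, w_plus, w_minus,
                  phi_plus=0.0, phi_minus=0.0):
    """Slowly varying amplitude sqrt((C+ - C-)^2 + 4 C+ C- cos^2(w t - phi)),

    with w = (w+ - w-)/2 and phi = (phi+ - phi-)/2.
    """
    w = 0.5 * (w_plus - w_minus)
    phi = 0.5 * (phi_plus - phi_minus)
    return math.sqrt((c_plus - c_minus) ** 2
                     + 4.0 * c_plus * c_minus * math.cos(w * t - phi) ** 2)


if __name__ == "__main__":
    # Assumed example: amplitude ratio 1:2 and 8 w+ = 9 w-.
    for step in range(0, 13):
        t = 0.5 * step
        s = two_mode_signal(t, 1.0, 2.0, 9.0, 8.0)
        e = beat_envelope(t, 1.0, 2.0, 9.0, 8.0)
        print(f"t = {t:4.1f}   x = {s:+.4f}   envelope = {e:.4f}")
```

For equal amplitudes the envelope drops to zero once per beat; for unequal amplitudes it only oscillates between $|C_{i+} - C_{i-}|$ and $C_{i+} + C_{i-}$.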

If one of the eigenvalues is zero, then this is not an oscillation, but free motion. If 
no external forces act, in a first step we separate out the center-of-mass motion and 
also the rotation of a rigid body. The following considerations are necessary only for 
the relative motion. 




Fig. 2.25 Examples of the displacements of two coupled oscillations as a function of time $t$ during a
period T. Left: Equal amplitudes of cx’. Right: For the amplitude ratio 1:2. In both cases, p+ = 50 
and $8\omega_+ = 9\omega_-$. The oscillation amplitudes are shown by dashed lines


2.3.10 Time-Dependent Oscillator. Parametric Resonance 


If the parameters kept fixed so far are assumed now to change rhythmically with 
time, then this affects the stability of the system. This is observed for a child’s swing, 
where the moment of inertia fluctuates in the course of time, and for a pendulum 
if its support oscillates vertically up and down. In both cases we encounter Hill’s 
differential equation 


$$\ddot x + f(t)\, x = 0 , \quad \text{with } f(t + T) = f(t) = f^*(t) ,$$


which we shall discuss now. We shall often write $\omega^2$ instead of $f$, even though $f$
may also become negative. In the end, generalizations of the functions $\cos(\omega t)$ and
$\sin(\omega t)$ are obtained, which belong to constant $f > 0$. Incidentally, Hill's differential
equation also arises in the quantum theory of crystals and in the theory of charged
particles in a synchrotron with alternating gradients, although $t$ is then a position
coordinate. The Bloch function is encountered in the context of a periodic potential.

We take the two (presently unknown) fundamental solutions $x_1$ and $x_2$ with
the properties $x_1(0) = 1 = \dot x_2(0)$ and $\dot x_1(0) = 0 = x_2(0)$. Their Wronski determinant
$x_1 \dot x_2 - \dot x_1 x_2$ has the value 1 for all $t$: this is its value for $t = 0$, and it is
constant because its derivative vanishes. All remaining solutions of the differential
equation can be expanded in terms of this basis. We clearly have $x(t) =
x(0)\, x_1(t) + \dot x(0)\, x_2(t)$ since this expression satisfies the differential equation and
the initial conditions. We may thus write

$$\begin{pmatrix} x(t) \\ \dot x(t) \end{pmatrix} = \begin{pmatrix} x_1(t) & x_2(t) \\ \dot x_1(t) & \dot x_2(t) \end{pmatrix} \begin{pmatrix} x(0) \\ \dot x(0) \end{pmatrix} \equiv U(t) \begin{pmatrix} x(0) \\ \dot x(0) \end{pmatrix} ,$$

and with this obtain an area-preserving time-shift matrix $U(t)$ (since $\det U = 1$).
We would now like to exploit the periodicity. So far we have not used it for the
time shift and, of course, we could also have introduced the matrix for other factors $f$.
The Floquet operator $U(T)$ will be important for us: for given initial conditions
it delivers $x(T)$ and $\dot x(T)$, and we have

$$x(t + T) = x(T)\, x_1(t) + \dot x(T)\, x_2(t) ,$$

because this expression satisfies the initial conditions and, since $f(t) = f(t + T)$,
it also satisfies the differential equation.

Therefore, we look for the eigenvalues $\sigma_\pm$ of $U(T)$. For a $2 \times 2$ matrix $U$, they
follow from $\sigma^2 - \sigma\, \mathrm{tr}\,U + \det U = 0$ and, because of $\det U = 1$, satisfy the equations
$\sigma_+ \sigma_- = 1$ and $\sigma_+ + \sigma_- = \mathrm{tr}\,U$. We thus set $\sigma_\pm = \exp(\pm i\varphi)$ and determine $\varphi$
from $\mathrm{tr}\,U = 2 \cos\varphi$, which is uniquely possible only up to an integer multiple of $\pi$.
However, we require in addition that $\varphi$ should depend continuously on $f$, and set
$\varphi = 0$ for $f = 0$. (For $f = 0$, we have $\mathrm{tr}\,U = 2$ because $x_1 = 1$ and $x_2 = t$.) Since
$x_1$ and $x_2$ are real initially and remain so for all times, $\mathrm{tr}\,U$ will also be real. Since
$\cos(\alpha + i\beta) = \cos\alpha \cosh\beta - i \sin\alpha \sinh\beta$, this means that either $\varphi$ has to be real
($\beta = 0$ for $|\mathrm{tr}\,U| < 2$) or its real part has to be an integer multiple of $\pi$ ($\alpha = n\pi$ for
$|\mathrm{tr}\,U| > 2$). For $|\mathrm{tr}\,U| < 2$, we thus have $|\sigma_\pm| = 1$, and for $|\mathrm{tr}\,U| > 2$, it is clear that
$|\sigma_\pm| \neq 1$. (We will return to the degeneracy for $|\mathrm{tr}\,U| = 2$.)

For the two eigensolutions (Floquet solutions), we have $x_\pm(t + T) = \sigma_\pm x_\pm(t)$.
Then for $|\mathrm{tr}\,U| > 2$, their moduli change by the factor $|\sigma_\pm| \neq 1$ for each additional $T$.
For $t \to \infty$, one of them exceeds all limits, while for $t \to -\infty$, it is the other that does
so. Therefore, they are said to be (Lyapunov) unstable. For $|\mathrm{tr}\,U| > 2$ all solutions of
the differential equation are unstable, because they are linear combinations of both of
these eigensolutions. In contrast, for $|\mathrm{tr}\,U| < 2$, the eigensolutions change only by a
complex factor of absolute value one with the time increment $T$. Here all solutions
are stable, and we may choose $x_- = x_+^{\,*}$.
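
The stability criterion $|\mathrm{tr}\,U(T)| < 2$ lends itself to a direct numerical test. The following sketch is an illustration only: it integrates the two fundamental solutions over one period with the classical fourth-order Runge-Kutta scheme (a choice made here, not a method from the text), and it uses the Mathieu form $f(t) = (\tfrac12\Omega)^2 (a - 2q\cos\Omega t)$ as the periodic coefficient. The function names and the default $\Omega = 2$ are assumptions.

```python
import math


def monodromy(f, period, steps=4000):
    """Integrate x'' + f(t) x = 0 over one period with classical RK4.

    Returns U(T) = [[x1(T), x2(T)], [x1'(T), x2'(T)]] for the fundamental
    solutions with x1(0) = 1, x1'(0) = 0 and x2(0) = 0, x2'(0) = 1.
    """
    h = period / steps

    def rhs(t, x, v):
        return v, -f(t) * x

    columns = []
    for x, v in ((1.0, 0.0), (0.0, 1.0)):
        t = 0.0
        for _ in range(steps):
            k1x, k1v = rhs(t, x, v)
            k2x, k2v = rhs(t + 0.5 * h, x + 0.5 * h * k1x, v + 0.5 * h * k1v)
            k3x, k3v = rhs(t + 0.5 * h, x + 0.5 * h * k2x, v + 0.5 * h * k2v)
            k4x, k4v = rhs(t + h, x + h * k3x, v + h * k3v)
            x += h / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
            v += h / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
            t += h
        columns.append((x, v))
    (x1, v1), (x2, v2) = columns
    return [[x1, x2], [v1, v2]]


def mathieu_trace(a, q, big_omega=2.0):
    """Trace of the Floquet operator for f(t) = (Omega/2)^2 (a - 2 q cos(Omega t))."""
    period = 2.0 * math.pi / big_omega

    def f(t):
        return (0.5 * big_omega) ** 2 * (a - 2.0 * q * math.cos(big_omega * t))

    u = monodromy(f, period)
    return u[0][0] + u[1][1]


def is_stable(a, q):
    """Stable (bounded) solutions require |tr U(T)| < 2."""
    return abs(mathieu_trace(a, q)) < 2.0


if __name__ == "__main__":
    for a, q in ((0.0, 0.5), (1.0, 0.5), (2.25, 0.0)):
        print(f"a = {a}, q = {q}: tr U = {mathieu_trace(a, q):+.6f}, "
              f"stable = {is_stable(a, q)}")
```

For constant $f = \omega_0^2$ the trace reduces to $2\cos(\omega_0 T)$, which gives a simple consistency check of the integrator before it is applied to a truly periodic $f$.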

Except for the factor $\sigma_\pm^{\,t/T} = \exp(\pm i\varphi\, t/T)$, the Floquet solutions have period
$T = 2\pi/\Omega$ and can therefore be represented by a Fourier series or a Laurent series.
These solutions are linearly independent if there is no degeneracy. For degeneracy
($|\mathrm{tr}\,U| = 2$ or $\sigma_\pm^{\,2} = 1$), there are stable as well as unstable solutions: $x(t) = Q(t) +
t\, P(t)$ with periodic $P$ and $Q$ (for $\sigma_\pm = +1$ with period $T$, for $\sigma_\pm = -1$ with period
$2T$). Here the differential equation for $x$ can be satisfied if $\ddot P + f P = 0$ and $\ddot Q +
f Q = -2 \dot P$. The expansion coefficients in the Fourier series depend on the function
$f(t)$.

The special case $f(t) = (\tfrac12 \Omega)^2 (a - 2q \cos \Omega t)$, Mathieu's differential equation,
has been thoroughly investigated (see, e.g., [1, 3]). It also arises in the separation
of the wave equation $(\Delta + k^2)\, u = 0$ in elliptic coordinates, where only periodic
solutions make sense, and they then acquire the special eigenvalues $a(q)$. The
curves $a(q)$ (see Fig. 2.26) separate the regions of stable and unstable Mathieu
functions, and thus also allowed and non-allowed energy bands in crystal fields with
the potential energy $V(x) = V_0 \cos(kx)$, because there we have $a = 8mE/(\hbar k)^2$ and
$q = 4mV_0/(\hbar k)^2$ (see Fig. 2.27). The computation of the Mathieu functions and their
stability chart is explained in more detail in Sect. 2.4.11.

Fig. 2.26 Stability chart of the functions solving the special Hill differential
equation $\ddot x + (\tfrac12 \Omega)^2 (a - 2q \cos \Omega t)\, x = 0$, here for $0 \le q \le 8$ and
$-5 \le a \le 15$. Curves indicate the stability limits. For $q = 0$, we must have
$a > 0$, while for $q > 0$, the region splits into bands which become ever narrower,
but also allow for $a < 0$

Simplifications are generally available if $f$ is an even function, thus if $f(-t) =
f(t)$ holds. In particular, $x_1$ is then even and $x_2$ odd, whence $x(T - t) = x(T)\, x_1(t) -
\dot x(T)\, x_2(t)$. If this is used for $t = T$ for the two fundamental solutions $x_1$ and $x_2$,
we obtain $\dot x_2(T) = x_1(T)$. For even $f$, we thus have $\cos\varphi = x_1(T)$. Therefore, the
solutions for $|x_1(T)| < 1$ are then stable and otherwise unstable. In addition, not
only $x(t)$ but also $x(-t)$ now solves the given differential equation. Therefore, we
may now also set $x_-(t) = x_+(-t)$ and $P_-(t) = P_+(-t)$.




Fig. 2.27 Real part of the Mathieu functions x+ for 0 < t < 8T and a = 0, for q = 1/4 (dotted 
curve), q = 2/4 (continuous curve), and q = 3/4 (dashed curve) 


Finally, we consider the weakly time-dependent oscillator:
$$f(t) = \omega_0^{\,2}\, \{1 + \varepsilon\, \alpha(t)\} , \quad \text{with } \varepsilon \ll 1 \text{ and } \alpha(t + T) = \alpha(t) .$$


Here $\mathrm{tr}\,U \approx 2 \cos(\omega_0 T)$ holds, whence $\varphi \approx \omega_0 T$. Therefore, the stability is only at
risk if $\omega_0 T$ is an integer multiple of $\pi$, thus if the period $T$ of $f$ or $\alpha$ is a half-integer or
integer multiple of the period $T_0 = 2\pi/\omega_0$ of the basic oscillation. Even for very small
fluctuations in the moments of inertia, an (undamped) swinging effect comes about.
This instability is called parametric resonance. It is particularly pronounced for $T =
\tfrac12 T_0$ because, according to Fig. 2.26, the first unstable band, at $a \approx 1$, is particularly
wide close to the axis $q = 0$, and the higher ones ($a \approx 2^2, 3^2, \dots$) are ever narrower there: when
swinging on a child's swing, we must move both on the way forth and on the way back, and anyone
who does that too rarely will not get into motion, whatever the effort.

Our starting equation also holds for a linear frictional force. Hence, if we start from
$\ddot y + 2\gamma \dot y + h(t)\, y = 0$ and set $y = \exp(-\gamma t)\, x$, then with $f = h - \gamma^2$, we arrive at
the starting equation. Naturally, the factor $\exp(-\gamma t)$ strengthens the stability, because
$\gamma$ is positive and only $t > 0$ is of interest. Now the solutions with $|\mathrm{Im}\,\varphi| < \gamma T$ are
still stable.

For a forced oscillation $\ddot y + 2\gamma \dot y + h(t)\, y = f(t)$, we may make the ansatz

$$y(t) = y_0(t) + \int_0^t g(t, t')\, f(t')\, dt'$$

for the solution. If $h$ did not depend on $t$, we might simplify the Green function $g(t, t')$
to $g(t - t')$, as was shown in Sect. 2.3.8. Correspondingly, we now have to require
$g(t', t') = 0$, $\dot g(t', t') = 1$, and $\ddot g + 2\gamma \dot g + h g = 0$, for $0 \le t' < t$ (the dot denotes the
derivative with respect to the first argument). If we replace (as
there) the upper integration limit by $\infty$, then $g(t, t') = 0$ has to hold for $t' > t$,
and therefore $\ddot g + 2\gamma \dot g + h g = \delta(t - t')$ must be valid. If $x_1$ and $x_2$ are linearly
independent solutions of the homogeneous differential equation $\ddot x + (h - \gamma^2)\, x = 0$,
then all these requirements can be satisfied with

$$g(t, t') = \exp\{-\gamma\, (t - t')\}\; \frac{x_1(t')\, x_2(t) - x_1(t)\, x_2(t')}{x_1(t')\, \dot x_2(t') - \dot x_1(t')\, x_2(t')} , \quad \text{for } t > t' \text{ (zero otherwise)} .$$

In particular, for $t \neq t'$, this expression satisfies the differential equation, it vanishes
for $t = t'$, and its first derivative with respect to $t$ makes a jump there from 0 to 1. The
above-mentioned Wronski determinant $x_1 \dot x_2 - \dot x_1 x_2$ appears in the denominator.

Incidentally, $g(t, t')$ does not need to vanish for $t < t'$, if we account for the
contribution to the initial values $y(0)$ and $\dot y(0)$ (thus modify $y_0$). The Green function
only has to satisfy the differential equation $\ddot g + 2\gamma \dot g + h g = \delta(t - t')$. This can be
done with

$$g(t, t') = \exp\{-\gamma\, (t - t')\}\; \frac{x_1(t_<)\, x_2(t_>)}{x_1 \dot x_2 - \dot x_1 x_2} ,$$

where $t_<$ is the smaller and $t_>$ the larger of the two values $t$ and $t'$, and again the
Wronski determinant appears in the denominator. Here, $\dot g$ jumps by 1 for $t = t'$, but
is not zero at lower $t$.


2.3.11 Summary: Lagrangian Mechanics 


In Sect. 2.1, we already anticipated some important aspects of Lagrangian mechanics, 
although we restricted ourselves there to time-independent phenomena. Geometric 
constraints can often be incorporated through the use of appropriate coordinates in a 
simpler way than by the associated forces of constraint. In particular, it is often the 
case that fewer variables (generalized coordinates) depend on the time—otherwise 
the constraints have to be accounted for by Lagrangian parameters in the Lagrange 
equations of the first kind. To this end, we generalized the principle of virtual work 
to d’Alembert’s principle by taking into account inertial forces. 

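With a convenient choice of coordinates, we have an $N$-body problem in the "configuration
space" with $f\,(\le 3N)$ dimensions and Lagrange equations of the second
kind

$$F_k = \frac{d}{dt}\frac{\partial T}{\partial \dot x^k} - \frac{\partial T}{\partial x^k} .$$

Here, the generalized forces $F_k = \sum_i \mathbf F_i \cdot \partial \mathbf r_i / \partial x^k$ are often derived from a potential
energy. The forces may even depend upon the velocity, since there may also be a
generalized potential energy $U$ with the property

$$F_k = \frac{d}{dt}\frac{\partial U}{\partial \dot x^k} - \frac{\partial U}{\partial x^k} .$$

Then we can use the Lagrange function
$$L = T - U$$
for calculations with the equations
$$\frac{d}{dt}\frac{\partial L}{\partial \dot x^k} = \frac{\partial L}{\partial x^k} .$$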

Several applications have been discussed and exemplified for these methods. With
the canonical momentum

$$p_k \equiv \frac{\partial L}{\partial \dot x^k} ,$$

which is conjugate to $x^k$, we may also write $\dot p_k = \partial L/\partial x^k$, or $\dot{\mathbf p} = \nabla L$ with $\mathbf p = \nabla_{\mathbf v} L$.
This canonical momentum is to be distinguished from the mechanical momentum
$m\mathbf v$, e.g., $\mathbf p = m\mathbf v + q\mathbf A$ holds if the vector potential $\mathbf A$ acts on the electric charge $q$
(the curl of $\mathbf A$ is the magnetic field $\mathbf B$). If $L$ does not explicitly depend on time, then
$\sum_k p_k \dot x^k - L$ is a constant of the motion. Furthermore, the conjugate momenta are
conserved for all cyclic variables, i.e., for those $x^k$ that do not appear in $L$, $p_k$ does
not depend on time.

We have investigated examples of various oscillations (harmonic, anharmonic, 
damped, forced, and coupled). Note that, while the solutions of linear differential 
equations change continuously with the initial conditions, this is different for non- 
linear ones, as illustrated by the example of the (anharmonic) pendulum near the 
separatrix. 


2.4 Hamiltonian Mechanics 


2.4.1 Hamilton Function and Hamiltonian Equations 


According to the last section, when a (generalized) potential is given, we may always
start from the Euler-Lagrange equation

$$\frac{d}{dt}\frac{\partial L}{\partial \dot x} = \frac{\partial L}{\partial x} .$$

Here $x$ stands for an arbitrary generalized position coordinate $x^k$ and $\dot x$ for its velocity
$\dot x^k$. But in Sect. 2.3.5, it already turned out that, instead of the velocity $\dot x$, it is often
better to consider the canonical momentum

$$p \equiv \frac{\partial L}{\partial \dot x} \quad\Longrightarrow\quad \dot p = \frac{\partial L}{\partial x}$$

conjugate to $x$. From now on, instead of the velocity $\dot x$, we shall always take this
momentum $p$ as an independent variable and investigate everything in the phase
space $(x, p)$, as we have already done for the pendulum orbits in Fig. 2.16 (see
Fig. 2.28). Here we may still gauge arbitrarily; only once the gauge is fixed does the
canonical momentum depend uniquely on the velocity. This greater freedom is occasionally of use and
often also provides a deeper understanding of the interrelations.

The new variable $p$ is the derivative of $L$ with respect to the variable $\dot x$ (hereafter
$\dot x$ will be replaced by $p$). Therefore, a Legendre transformation is necessary. Instead
of the Lagrange function $L(t, x, \dot x)$ with

$$dL = \frac{\partial L}{\partial t}\, dt + \frac{\partial L}{\partial x}\, dx + \frac{\partial L}{\partial \dot x}\, d\dot x ,$$

we have to take the Hamilton function⁵ $H(t, x, p)$ with

$$dH = \frac{\partial H}{\partial t}\, dt + \frac{\partial H}{\partial x}\, dx + \frac{\partial H}{\partial p}\, dp \quad\text{and}\quad H = p \dot x - L .$$

In particular, the last equation implies $dH = \dot x\, dp + p\, d\dot x - dL$ or $dH = \dot x\, dp -
(\partial L/\partial t)\, dt - (\partial L/\partial x)\, dx$. Comparing this expression with the one before, we then
find

$$\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} , \quad \frac{\partial H}{\partial x} = -\frac{\partial L}{\partial x} , \quad \frac{\partial H}{\partial p} = \dot x .$$

Fig. 2.28 Representation of a harmonic oscillation in the (two-dimensional) phase space (with
convenient scales for the $x$- and the $p$-coordinate; otherwise we obtain an ellipse). The points (•)
are traversed clockwise. The phase-space units may not be arbitrarily small, according to quantum
physics; otherwise there would be a contradiction with Heisenberg's uncertainty relation


We reformulate the middle relation with the Lagrange equation and find that, for the
conjugate variables $x^k$ and $p_k$, and with the Hamilton function

$$H = \sum_k p_k \dot x^k - L ,$$

we obtain the Hamilton equations

$$\dot x^k = \frac{\partial H}{\partial p_k} , \quad \dot p_k = -\frac{\partial H}{\partial x^k} .$$

These are very general and we shall thus refer to them as the canonical equations.
In Lagrangian mechanics, there is one differential equation of second order for each
degree of freedom, whereas in Hamiltonian mechanics, there are always two differential
equations of first order. In addition, one has

$$\frac{dH}{dt} = \frac{\partial H}{\partial t} + \sum_k \left( \frac{\partial H}{\partial x^k}\frac{\partial H}{\partial p_k} - \frac{\partial H}{\partial p_k}\frac{\partial H}{\partial x^k} \right) = \frac{\partial H}{\partial t} .$$


5 William Rowan Hamilton (1805-1865). 




If further the Hamilton function H does not depend explicitly on time, then it remains 
a conserved quantity along all orbits. 
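
As an illustration of the canonical equations and of the conservation of a time-independent $H$, the following sketch integrates $\dot x = \partial H/\partial p$, $\dot p = -\partial H/\partial x$ for the harmonic oscillator $H = p^2/2m + \tfrac12 m\omega^2 x^2$. The leapfrog (kick-drift-kick) scheme used here is a choice made for this example, not a method from the text, and the function names and parameter values are assumptions.

```python
import math


def hamiltonian(x, p, m=1.0, omega=1.0):
    """H = p^2 / (2m) + m omega^2 x^2 / 2."""
    return p * p / (2.0 * m) + 0.5 * m * omega * omega * x * x


def integrate(x, p, t_end, steps, m=1.0, omega=1.0):
    """Integrate the Hamilton equations with the leapfrog scheme.

    Each step applies half of p' = -dH/dx, all of x' = dH/dp, and then the
    second half of p' = -dH/dx.
    """
    dt = t_end / steps
    for _ in range(steps):
        p -= 0.5 * dt * m * omega * omega * x   # half kick
        x += dt * p / m                         # drift
        p -= 0.5 * dt * m * omega * omega * x   # half kick
    return x, p


if __name__ == "__main__":
    m, omega = 1.0, 2.0
    x0, p0 = 1.0, 0.0
    period = 2.0 * math.pi / omega
    x1, p1 = integrate(x0, p0, period, 10000, m, omega)
    print("after one period: x =", x1, " p =", p1)
    print("energy drift:", hamiltonian(x1, p1, m, omega) - hamiltonian(x0, p0, m, omega))
```

After one full period the phase-space point returns (up to the discretization error) to its start, and the value of $H$ stays essentially unchanged along the orbit.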

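If there is a potential energy $V$ (and hence also $L = T - V$), and if $T$ is a homogeneous
function of second order in the velocities (so that $p \dot x = 2T$, according to
p. 101), we have $H = T + V$, so $H$ is an energy. But we shall find shortly that $H$
and $E$ may also be different.

For a non-relativistic particle of mass $m$ and charge $q$ in an electromagnetic field,
we infer the Hamilton function from the Lagrange function

$$L = \frac{m}{2}\, \mathbf v \cdot \mathbf v - q\, (\Phi - \mathbf v \cdot \mathbf A)$$

in Sect. 2.3.5 (p. 100) as $H = \mathbf p \cdot \mathbf v - L$. To this end, we only have to express the
velocity $\mathbf v$ in terms of the canonical momentum $\mathbf p = m\mathbf v + q\mathbf A$ (see p. 100). Since
$(m\mathbf v + q\mathbf A) \cdot \mathbf v - L = \frac{m}{2}\, \mathbf v \cdot \mathbf v + q\Phi$ and $\mathbf v = (\mathbf p - q\mathbf A)/m$, this leads to

$$H(t, \mathbf r, \mathbf p) = \frac{(\mathbf p - q\mathbf A) \cdot (\mathbf p - q\mathbf A)}{2m} + q\Phi .$$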

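If the magnetic field $\mathbf B$ depends neither on time nor on position, then according to
p. 100, we may use the vector potential $\mathbf A = \tfrac12 \mathbf B \times \mathbf r$ with $q\mathbf B = -m\boldsymbol\omega$ (see p. 78),
where $\boldsymbol\omega$ is the associated cyclotron frequency. It then follows from $m\dot{\mathbf r} = \mathbf p - q\mathbf A =
\mathbf p + \tfrac12 m\, \boldsymbol\omega \times \mathbf r$ or from $\dot{\mathbf r} = \nabla_{\mathbf p} H$ that

$$\dot{\mathbf r} = \frac{\mathbf p}{m} + \tfrac12\, \boldsymbol\omega \times \mathbf r .$$

In addition, for $\Phi = 0$, $\dot{\mathbf p} = -\nabla_{\mathbf r} H$ (in agreement with $\dot{\mathbf p} = \tfrac12 \mathbf F = \tfrac12 q\, \mathbf v \times \mathbf B$ on p. 100)
delivers
$$\dot{\mathbf p} = \tfrac12\, \boldsymbol\omega \times (\mathbf p + \tfrac12 m\, \boldsymbol\omega \times \mathbf r) ,$$
and thus,
$$\dot{\mathbf p} = \tfrac12 m\, \boldsymbol\omega \times \dot{\mathbf r} .$$

We have already integrated these differential equations on p. 100.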

According to p. 98, for a gauge transformation $\Phi' = \Phi + \partial\Psi/\partial t$, $\mathbf A' = \mathbf A - \nabla\Psi$,
the Lagrange function is transformed into $L' = L - dG/dt$ with $G = q\Psi$, and the
canonical momentum into $\mathbf p' = \mathbf p - \nabla G$ (see p. 100). Since $dG/dt = \partial G/\partial t +
\nabla G \cdot \mathbf v$, the Hamilton function is

$$H' = \mathbf p' \cdot \mathbf v - L' = H + \frac{\partial G}{\partial t} .$$

The term $\partial G/\partial t$ may depend on position and time, which is more than the arbitrariness
in the choice of the zero energy. Therefore, the Hamilton function agrees with the
energy only for an appropriate gauge. A more detailed investigation is available in
[4]. The scalar potential $\Phi$ must not depend upon time: only then can $q\Phi(\mathbf r)$ be
a potential energy $V(\mathbf r)$ and $H - V$ a homogeneous function of second order in the
velocity, so that $H$ is an energy. If the electric field $\mathbf E$ depends on time, then
this has to be included in the vector potential $\mathbf A$, or more precisely, in its sources,
because its curl determines the magnetic field $\mathbf B$. For a time-dependent force, its path
integral depends upon the amount of time needed to traverse this path. The force field
is then not always curl-free and therefore cannot be derived from a potential energy.

In the Lagrangian formalism, we find that $p_k$ is a constant of the motion if $L$ does
not depend on $x^k$, i.e., if $x^k$ is a cyclic coordinate. This leads to $0 = \partial L/\partial x^k = \dot p_k =
-\partial H/\partial x^k$ in the Hamiltonian formalism: then $x^k$ does not appear in $H$. Hence,
the conservation of momentum and angular momentum follows immediately for
each system with only internal forces, for which $H$ does not involve center-of-mass
coordinates.


2.4.2 Poisson Brackets 
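The Poisson brackets for functions $u(t, x, p)$ and $v(t, x, p)$ are defined by

$$[u, v] \equiv \sum_k \left( \frac{\partial u}{\partial x^k}\frac{\partial v}{\partial p_k} - \frac{\partial u}{\partial p_k}\frac{\partial v}{\partial x^k} \right)$$

and have the properties (with constant $\alpha$ and $\beta$)

$$[u, v] = -[v, u] ,$$
$$[uv, w] = u\, [v, w] + [u, w]\, v ,$$
$$[\alpha u + \beta v, w] = \alpha\, [u, w] + \beta\, [v, w] .$$

In addition, the Jacobi identity holds:
$$[u, [v, w]] + [v, [w, u]] + [w, [u, v]] = 0 ,$$
as for the vector product on p. 4. This is proved in Problem 2.43 using $\partial u/\partial x = u_x$, $\partial u/\partial p = u_p$,
and similarly for $v$ and $w$ instead of $u$. The Hamilton equations
lead to
$$[u, H] = \sum_k \left( \frac{\partial u}{\partial x^k}\, \dot x^k + \frac{\partial u}{\partial p_k}\, \dot p_k \right) = \frac{du}{dt} - \frac{\partial u}{\partial t} ,$$
and for arbitrary $u$, we deduce

$$\frac{du}{dt} = [u, H] + \frac{\partial u}{\partial t} .$$

If $u$ does not depend explicitly on time, then $\dot u$ is equal to the Poisson bracket of $u$
with the Hamilton function $H$. In particular, we obtain

$$\dot x^k = [x^k, H] , \quad \dot p_k = [p_k, H] ,$$

instead of the Hamilton equations.
Since position and momentum coordinates do not depend on each other, we also
have

$$[x^i, x^j] = 0 = [p_i, p_j] , \quad [x^i, p_j] = \delta^i_j = \begin{cases} 1 & \text{for } i = j , \\ 0 & \text{for } i \neq j . \end{cases}$$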
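
For one degree of freedom, the bracket $[u, v] = \partial u/\partial x\; \partial v/\partial p - \partial u/\partial p\; \partial v/\partial x$ can be approximated with central differences. The sketch below is an illustration only (the function names, the step size, and the sample functions are assumptions); it checks the fundamental bracket $[x, p] = 1$, the antisymmetry, and $\dot u = [u, H]$ for the harmonic oscillator with $m = \omega = 1$.

```python
def poisson_bracket(u, v, x, p, h=1e-5):
    """Approximate [u, v] = du/dx dv/dp - du/dp dv/dx at the point (x, p).

    u and v are functions of (x, p); derivatives use central differences.
    """
    du_dx = (u(x + h, p) - u(x - h, p)) / (2.0 * h)
    du_dp = (u(x, p + h) - u(x, p - h)) / (2.0 * h)
    dv_dx = (v(x + h, p) - v(x - h, p)) / (2.0 * h)
    dv_dp = (v(x, p + h) - v(x, p - h)) / (2.0 * h)
    return du_dx * dv_dp - du_dp * dv_dx


def position(x, p):
    return x


def momentum(x, p):
    return p


def oscillator(x, p):
    """Harmonic-oscillator Hamilton function with m = omega = 1."""
    return 0.5 * (p * p + x * x)


if __name__ == "__main__":
    x0, p0 = 0.3, 0.7
    print("[x, p] =", poisson_bracket(position, momentum, x0, p0))
    print("[x, H] =", poisson_bracket(position, oscillator, x0, p0), "(expected dx/dt = p)")
    print("[p, H] =", poisson_bracket(momentum, oscillator, x0, p0), "(expected dp/dt = -x)")
```

The last two lines reproduce the Hamilton equations $\dot x = [x, H]$ and $\dot p = [p, H]$ at the chosen phase-space point.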

These equations will play an important role for the transition to quantum mechanics,
where the quantities will be replaced by (Hermitian) operators and the Poisson
brackets by commutators (divided by $i\hbar$). Connections can also be found with these
results in thermodynamics (statistical mechanics), namely, with the Liouville equation.
The latter gives the time dependence of the probability density $\rho$ in phase space
and states that $d\rho/dt = 0$:

$$\frac{d\rho}{dt} = 0 \quad\Longrightarrow\quad \frac{\partial \rho}{\partial t} + [\rho, H] = 0 .$$

Whatever is altered in a volume element of the phase space happens because of the
equations of motion. This equation is proven in Sect. 2.4.4. With the probability
density $\rho$, the mean values $\bar A$ of functions $A(t, x, p)$ can be evaluated from $\bar A =
\int \rho\, A\, dx\, dp$.


2.4.3 Canonical Transformations 


We would now like to choose new coordinates in phase space (still for fixed time),
and possibly also a new Hamilton function, such that the canonical equations are
still valid. In the Lagrangian formalism, we only considered transformations in the
configuration space, which has only half as many coordinates. For the moment we
restrict ourselves to just one degree of freedom and leave out the index $k$. Then
the Poisson bracket $[u, v]$ is the same as the functional determinant $\partial(u, v)/\partial(x, p)$.
Since

$$\frac{\partial(u, v)}{\partial(x, p)} = \frac{\partial(u, v)}{\partial(x', p')}\; \frac{\partial(x', p')}{\partial(x, p)} ,$$

it only remains the same for transformations of the phase-space coordinates when
the functional determinant of the new phase-space coordinates is equal to 1, viz.,

$$\frac{\partial(x', p')}{\partial(x, p)} = 1 ,$$


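i.e., if the map is area-preserving. (If we no longer require the restriction $f = 1$, then
this constraint is necessary, but not sufficient for canonical transformations. We shall
deal with this later.) If we write

$$\begin{pmatrix} dx' \\ dp' \end{pmatrix} = K \begin{pmatrix} dx \\ dp \end{pmatrix} , \quad \text{with } K = \begin{pmatrix} \partial x'/\partial x & \partial x'/\partial p \\ \partial p'/\partial x & \partial p'/\partial p \end{pmatrix} ,$$

then, because $[x', p'] = 1$, for the inverse $K^{-1}$ of this $2 \times 2$ matrix given by the
formula on p. 71, we have

$$K^{-1} = \begin{pmatrix} \partial x/\partial x' & \partial x/\partial p' \\ \partial p/\partial x' & \partial p/\partial p' \end{pmatrix} = \begin{pmatrix} \partial p'/\partial p & -\partial x'/\partial p \\ -\partial p'/\partial x & \partial x'/\partial x \end{pmatrix} .$$

The two matrices must have equal elements. This results in the four equations

$$\frac{\partial x'}{\partial x} = \frac{\partial p}{\partial p'} , \quad \frac{\partial x'}{\partial p} = -\frac{\partial x}{\partial p'} , \quad \frac{\partial p'}{\partial x} = -\frac{\partial p}{\partial x'} , \quad \frac{\partial p'}{\partial p} = \frac{\partial x}{\partial x'} .$$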


Here one alone actually suffices (e.g., the first), because the remaining ones follow 
from this one according to p. 44, in particular the second from 


Cole ec 


the third from 


(*). (4), = () (=), : 


and the fourth from 


(=) (#), =(%).(%),. 


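This we generalize now to $f > 1$ for time-independent canonical transformations.
With $i, k \in \{1, \dots, f\}$, we obtain the following constraints:

$$\frac{\partial x'^i}{\partial x^k} = \frac{\partial p_k}{\partial p_i'} , \quad \frac{\partial x'^i}{\partial p_k} = -\frac{\partial x^k}{\partial p_i'} , \quad \frac{\partial p_i'}{\partial x^k} = -\frac{\partial p_k}{\partial x'^i} , \quad \frac{\partial p_i'}{\partial p_k} = \frac{\partial x^k}{\partial x'^i} .$$

Here for the first (and last) equation, the notation with upper and lower indices
from Sect. 1.2.2 turns out to be quite successful, and the remaining equations follow
therefrom. In fact, these equations ensure that

$$\dot x'^i = \sum_k \left( \frac{\partial x'^i}{\partial x^k}\, \dot x^k + \frac{\partial x'^i}{\partial p_k}\, \dot p_k \right) = \sum_k \left( \frac{\partial p_k}{\partial p_i'}\frac{\partial H}{\partial p_k} + \frac{\partial x^k}{\partial p_i'}\frac{\partial H}{\partial x^k} \right) = \frac{\partial H}{\partial p_i'} ,$$

$$\dot p_i' = \sum_k \left( \frac{\partial p_i'}{\partial x^k}\, \dot x^k + \frac{\partial p_i'}{\partial p_k}\, \dot p_k \right) = -\sum_k \left( \frac{\partial p_k}{\partial x'^i}\frac{\partial H}{\partial p_k} + \frac{\partial x^k}{\partial x'^i}\frac{\partial H}{\partial x^k} \right) = -\frac{\partial H}{\partial x'^i} .$$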

If, for a time-independent transformation, we have $H' = H$, then the canonical equations
remain untouched. Therefore, the name canonical transformation makes sense.
The linear transformation (with functional determinant 1)

$$\begin{pmatrix} x' \\ p' \end{pmatrix} = \begin{pmatrix} a_{xx} & a_{xp} \\ a_{px} & a_{pp} \end{pmatrix} \begin{pmatrix} x \\ p \end{pmatrix} , \quad \text{with } \det a = 1 ,$$

is clearly also canonical. In particular, we may choose $a_{xx} = a_{pp} = \cos\alpha$ and
$a_{px} = -a_{xp} = \sin\alpha$, i.e., rotate in phase space. Therefore, the identity (with $\alpha = 0$)
is canonical, but so also is the transformation $x' = p$, $p' = -x$ (with $\alpha = -\tfrac12 \pi$).
This shows clearly that the meaning of position and momentum coordinates becomes
blurred for the canonical equations; therefore $q$ is often written preferentially for
the generalized position coordinates rather than $x$. Moreover, the canonical transformations
are essentially more general than the point transformations which are the
only ones allowed in the Lagrangian formalism, i.e., in the latter, only the coordinates
could be chosen, but not the velocities.
Let us consider the example of a linear harmonic oscillation with

$$H(x, p) = \frac{p^2 + m^2 \omega^2 x^2}{2m} .$$

Here only the squares of $x$ and $p$ appear. Therefore, by a non-linear canonical transformation,
a cyclic coordinate $x'$ can be introduced. We make a transition to polar
coordinates, which are suggested according to Fig. 2.28:

$$x = \frac{f(p') \sin x'}{m\omega} , \quad p = f(p') \cos x' \quad\Longrightarrow\quad H' = \frac{f^2(p')}{2m} .$$

The transformation is only canonical if $f(p')$ obeys the constraint $f\, df/dp' = m\omega$
(since $m\omega \det K^{-1} = f\, df/dp'$). The associated differential form $f\, df = m\omega\, dp'$
is easily integrated:

$$\tfrac12 f^2(p') = m\omega\, p' \quad\Longrightarrow\quad H' = \omega\, p' .$$




No integration constant is added here because it would only move the zero energy.
Now the Hamilton equations are very simple and are easily integrated:

$$\dot x' = \frac{\partial H'}{\partial p'} = \omega \quad\Longrightarrow\quad x' = \omega\, (t - t_0) ,$$
$$\dot p' = -\frac{\partial H'}{\partial x'} = 0 \quad\Longrightarrow\quad p' = \text{const} = \frac{H'}{\omega} .$$

If we write $E_0$ instead of $H'$ for the total energy, then because $f^2(p') = 2m E_0$ and
with the abbreviations $\hat p = \sqrt{2m E_0}$ and $\hat x = \hat p/(m\omega)$ for the original variables, we
obtain

$$x = \hat x \sin(\omega (t - t_0)) , \quad p = \hat p \cos(\omega (t - t_0)) .$$

As expected, we have had to integrate two differential equations of first order instead
of one of second order. The integration constants $E_0$ and $t_0$ can be adjusted to the
initial values.
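
The transformation to the cyclic coordinate can be verified numerically. The sketch below is an illustration only: it uses $f(p') = \sqrt{2m\omega p'}$, which follows from $\tfrac12 f^2 = m\omega p'$, and checks that $H = \omega p'$ and that the functional determinant $\partial(x, p)/\partial(x', p')$ equals 1. The function names and the sample values are assumptions.

```python
import math


def from_angle_action(angle, action, m, omega):
    """Map (x', p') = (angle, action) to the original variables (x, p).

    Uses f(p') = sqrt(2 m omega p'), so that x = f sin(x') / (m omega)
    and p = f cos(x').
    """
    f = math.sqrt(2.0 * m * omega * action)
    return f * math.sin(angle) / (m * omega), f * math.cos(angle)


def oscillator_energy(x, p, m, omega):
    """H(x, p) = (p^2 + m^2 omega^2 x^2) / (2m)."""
    return (p * p + (m * omega * x) ** 2) / (2.0 * m)


def jacobian_det(angle, action, m, omega, h=1e-6):
    """Central-difference estimate of d(x, p) / d(x', p')."""
    xa1, pa1 = from_angle_action(angle + h, action, m, omega)
    xa0, pa0 = from_angle_action(angle - h, action, m, omega)
    xb1, pb1 = from_angle_action(angle, action + h, m, omega)
    xb0, pb0 = from_angle_action(angle, action - h, m, omega)
    dx_dangle = (xa1 - xa0) / (2.0 * h)
    dp_dangle = (pa1 - pa0) / (2.0 * h)
    dx_daction = (xb1 - xb0) / (2.0 * h)
    dp_daction = (pb1 - pb0) / (2.0 * h)
    return dx_dangle * dp_daction - dx_daction * dp_dangle


if __name__ == "__main__":
    m, omega = 2.0, 3.0
    angle, action = 0.4, 1.7
    x, p = from_angle_action(angle, action, m, omega)
    print("H(x, p)     =", oscillator_energy(x, p, m, omega))
    print("omega * p'  =", omega * action)
    print("determinant =", jacobian_det(angle, action, m, omega))
```

Since the determinant is 1 everywhere, the map is area-preserving, and the new Hamilton function depends only on $p'$, so $x'$ is indeed cyclic.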

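For a charged point mass in a homogeneous magnetic field, we only search for the
motion perpendicular to this field and, according to p. 123, the Hamilton function is

$$H(x, y, p_x, p_y) = \frac{(p_x - \tfrac12 m\omega y)^2 + (p_y + \tfrac12 m\omega x)^2}{2m} .$$

We carry out the canonical transformation

$$x' = \frac{x}{2} + \frac{p_y}{m\omega} , \quad p_x' = p_x - \frac{m\omega}{2}\, y ,$$
$$y' = \frac{x}{2} - \frac{p_y}{m\omega} , \quad p_y' = p_x + \frac{m\omega}{2}\, y .$$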

The proof that it is truly canonical is rather cumbersome at the present stage, because
here there are four derivatives of the primed quantities with respect to the unprimed
ones to be determined, and likewise many derivatives of the inverse functions, but
at the end of Sect. 2.4.5, there is a generating function of this transformation, which
simplifies the proof (see Problems 2.47-2.48). The Hamilton function now reads
$H' = (p_x'^{\,2} + m^2\omega^2 x'^{\,2})/2m$. The coordinates $y'$ and $p_y'$ are cyclic, and we recognize the
Hamilton function of a linear harmonic oscillation with the cyclotron frequency as
angular frequency. The two cyclic coordinates are related to the pseudo-momentum
$\mathbf K$ (treated on p. 100):

$$\mathbf K = \mathbf p + \tfrac12 q\, \mathbf B \times \mathbf r = \mathbf p - \tfrac12 m\, \boldsymbol\omega \times \mathbf r ,$$
whence $K_x = p_y'$ and $K_y = -m\omega\, y'$. It was introduced earlier as a conserved quantity
and delivered the center of the circular orbit. Here it is also clear that $\mathbf K \cdot \mathbf K/2m$
belongs to a linear oscillation with the cyclotron frequency as the angular frequency.

The angular momentum is given by

$$L_z = x p_y - y p_x = \frac{1}{\omega} \left( H' - \frac{\mathbf K \cdot \mathbf K}{2m} \right) .$$

We can thus split $H'$ into $\omega L_z$ and $\mathbf K \cdot \mathbf K/2m$.
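
Whether a linear transformation of $(x, y, p_x, p_y)$ is canonical can be tested with the condition $M J \tilde M = J$, whose entries are just the fundamental Poisson brackets of the new variables. The sketch below is an illustration under an explicit assumption: it takes the transformation in the form $x' = x/2 + p_y/(m\omega)$, $y' = x/2 - p_y/(m\omega)$, $p_x' = p_x - \tfrac12 m\omega y$, $p_y' = p_x + \tfrac12 m\omega y$ (the reading adopted here for the transformation of this section), and all function names and sample values are hypothetical.

```python
def transform_matrix(m, omega):
    """Matrix M with (x', y', px', py') = M (x, y, px, py) (assumed form)."""
    a = 0.5 * m * omega
    b = 1.0 / (m * omega)
    return [
        [0.5, 0.0, 0.0, b],     # x'  = x/2 + py/(m omega)
        [0.5, 0.0, 0.0, -b],    # y'  = x/2 - py/(m omega)
        [0.0, -a, 1.0, 0.0],    # px' = px - (m omega/2) y
        [0.0, a, 1.0, 0.0],     # py' = px + (m omega/2) y
    ]


# Symplectic unit matrix for the ordering (x, y, px, py).
J = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
]


def bracket_matrix(M):
    """Return M J M~; its (a, b) entry is the Poisson bracket [z'_a, z'_b]."""
    n = len(M)
    return [[sum(M[a][i] * J[i][j] * M[b][j] for i in range(n) for j in range(n))
             for b in range(n)] for a in range(n)]


def is_canonical(M, tol=1e-12):
    """A linear map is canonical if M J M~ reproduces J."""
    P = bracket_matrix(M)
    return all(abs(P[a][b] - J[a][b]) < tol for a in range(4) for b in range(4))


def apply(M, z):
    return [sum(M[a][i] * z[i] for i in range(4)) for a in range(4)]


def h_old(z, m, omega):
    x, y, px, py = z
    a = 0.5 * m * omega
    return ((px - a * y) ** 2 + (py + a * x) ** 2) / (2.0 * m)


def h_new(z_new, m, omega):
    x_new, _, px_new, _ = z_new
    return (px_new ** 2 + (m * omega * x_new) ** 2) / (2.0 * m)


if __name__ == "__main__":
    m, omega = 1.5, 2.0
    M = transform_matrix(m, omega)
    z = [0.3, -1.2, 0.7, 0.4]
    print("canonical:", is_canonical(M))
    print("H  =", h_old(z, m, omega))
    print("H' =", h_new(apply(M, z), m, omega))
```

With this form of the transformation, the new Hamilton function contains only $x'$ and $p_x'$, so $y'$ and $p_y'$ drop out as cyclic variables.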


2.4.4 Infinitesimal Canonical Transformations. Liouville 
Equation 


An infinitesimal canonical transformation is defined by

$$x' = x + \frac{\partial g(x, p)}{\partial p}\, \varepsilon , \quad p' = p - \frac{\partial g(x, p)}{\partial x}\, \varepsilon ,$$

if $\varepsilon$ is small enough to be able to neglect terms of the order of $\varepsilon^2$ compared to 1 in
the functional determinant, and thus use the fact that $\partial^2 g/\partial p\, \partial x = \partial^2 g/\partial x\, \partial p$ (for
which $g$ has to be twice continuously differentiable). In particular, also

$$x' = x + \dot x\, dt = x + \frac{\partial H}{\partial p}\, dt , \quad p' = p + \dot p\, dt = p - \frac{\partial H}{\partial x}\, dt ,$$

is a canonical transformation:

We can interpret the time evolution of the system as a canonical transformation.


This yields Liouville's theorem, regarding the time dependence of the probability
density in phase space, thus of the weight with which each volume element of the
phase space contributes to a statistical ensemble (e.g., for the molecules of an ideal
gas; more on that in Sect. 6.2.3). In particular, the density has to have the property
$\rho'(t, x', p')\, dx'\, dp' = \rho(t, x, p)\, dx\, dp$ because, despite its motion, each phase-space
element keeps its probability content. Since each canonical transformation is
area-preserving, it follows that

$$\rho'(t, x', p') = \rho(t, x, p) \quad\Longrightarrow\quad \frac{d\rho}{dt} = 0 ,$$

and hence the Liouville (continuity) equation

$$\frac{\partial \rho}{\partial t} + [\rho, H] = 0 .$$

In equilibrium, $\rho$ does not depend explicitly on time. Then $[\rho, H] = 0$.




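Table 2.1 Generators and infinitesimal transformations

Generating function g    Change         Infinitesimal transformation
$H$                      $dt$           $x' = x + \dot x\, dt$,  $p' = p + \dot p\, dt$
$p$                      $dx$           $x' = x + dx$,  $p' = p$
$p_\varphi$              $d\varphi$     $\varphi' = \varphi + d\varphi$,  $p_\varphi' = p_\varphi$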


The above function $g(x, p)$ is usually called the generating function (generator)
of the infinitesimal canonical transformation. In particular, the Hamilton function
$H$ generates a time shift, the momentum $p$ a change in position, and the angular
momentum $p_\varphi$ a rotation, as listed in Table 2.1.

For Cartesian coordinates in the last row, the generating function $L_z = x p_y -
y p_x$ is to be taken. This delivers

$$x' = x - y\, d\varphi , \quad p_x' = p_x - p_y\, d\varphi ,$$
$$y' = y + x\, d\varphi , \quad p_y' = p_y + p_x\, d\varphi ,$$
as required for a rotation through the angle $d\varphi$ about the $z$-axis. Generally, we require
as generating function the quantity canonically conjugate to the differential variable,
so we also view the time $t$ and the Hamilton function (energy) $H$ as canonically
conjugate to each other.


2.4.5 Generating Functions 


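Finite and time-dependent canonical transformations can also be derived from generating
functions. To this end, we start preferably from the gauge dependence of the
Lagrange function (see p. 98), and $L = p\,\dot x - H$. Since $L' = L - dG/dt$, we have

\[
dG = (L - L')\,dt = (H' - H)\,dt + p\,dx - p'\,dx' .
\]

If we now make the ansatz that $G$ and $x'$ are functions of $t$, $x$, and $p$, we obtain

\[
\begin{aligned}
dG &= \frac{\partial G}{\partial t}\,dt + \frac{\partial G}{\partial x}\,dx + \frac{\partial G}{\partial p}\,dp ,\\
dx' &= \frac{\partial x'}{\partial t}\,dt + \frac{\partial x'}{\partial x}\,dx + \frac{\partial x'}{\partial p}\,dp .
\end{aligned}
\]

Therefore, we infer

\[
\begin{aligned}
\frac{\partial G}{\partial t} &= H' - H - p'\,\frac{\partial x'}{\partial t} ,\\
\frac{\partial G}{\partial x} &= p - p'\,\frac{\partial x'}{\partial x} ,\\
\frac{\partial G}{\partial p} &= - p'\,\frac{\partial x'}{\partial p} .
\end{aligned}
\]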


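The transformation is canonical if the two mixed derivatives

\[
\begin{aligned}
\frac{\partial^2 G}{\partial p\,\partial x} &= 1 - \frac{\partial p'}{\partial p}\frac{\partial x'}{\partial x} - p'\,\frac{\partial^2 x'}{\partial p\,\partial x} ,\\
\frac{\partial^2 G}{\partial x\,\partial p} &= - \frac{\partial p'}{\partial x}\frac{\partial x'}{\partial p} - p'\,\frac{\partial^2 x'}{\partial x\,\partial p}
\end{aligned}
\]

agree with each other, and likewise those of $x'(t, x, p)$. Then, in particular, we have

\[
\frac{\partial x'}{\partial x}\frac{\partial p'}{\partial p} - \frac{\partial x'}{\partial p}\frac{\partial p'}{\partial x}
= \frac{\partial (x', p')}{\partial (x, p)} = [x', p'] = 1 .
\]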


Thus x’, p’, and H’ have to obey the partial differential equations for G above 
(derivatives of G with respect to t, x, and p). In particular H’ = H holds if G and 
x’ do not depend explicitly on time. 

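In the last section, we introduced generating functions $g(x, p)$ for the infinitesimal
transformations. We now ask how they are connected with $G(x, p)$. Since

\[
x' = x + \varepsilon\,\frac{\partial g}{\partial p} \quad\text{and}\quad p' = p - \varepsilon\,\frac{\partial g}{\partial x} ,
\]

then up to terms of order $\varepsilon^2$, we have

\[
\begin{aligned}
\frac{\partial G}{\partial x} &= p - \Big(p - \varepsilon\,\frac{\partial g}{\partial x}\Big)\Big(1 + \varepsilon\,\frac{\partial^2 g}{\partial x\,\partial p}\Big)
\approx \varepsilon\,\frac{\partial}{\partial x}\Big(g - p\,\frac{\partial g}{\partial p}\Big) ,\\
\frac{\partial G}{\partial p} &= - \Big(p - \varepsilon\,\frac{\partial g}{\partial x}\Big)\,\varepsilon\,\frac{\partial^2 g}{\partial p^2}
\approx \varepsilon\,\frac{\partial}{\partial p}\Big(g - p\,\frac{\partial g}{\partial p}\Big) .
\end{aligned}
\]

Therefore, we may take $G(x, p) \approx \varepsilon\,(g - p\,\partial g/\partial p)$ and obtain a unique connection
between $G$ and $g$, whereupon both shall be referred to as generating functions.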

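Likewise, we may also take $x$ and $p$ as functions of $x'$ and $p'$ or any other pair of
old and new phase-space coordinates as functions of the other pair. However, different
generating functions appear then. Later we will denote them by $G$ and include the
associated variables. So, with $G(t, x', p')$, $x(t, x', p')$, and $p(t, x', p')$, for example,
we have

\[
\frac{\partial G}{\partial t} = H' - H + p\,\frac{\partial x}{\partial t} , \qquad
\frac{\partial G}{\partial x'} = p\,\frac{\partial x}{\partial x'} - p' , \qquad
\frac{\partial G}{\partial p'} = p\,\frac{\partial x}{\partial p'} .
\]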


Here, too, x and p result from partial differential equations. 


But if the generating function depends upon a primed and an unprimed variable
(except for the time, which is not transformed, i.e., $t = t'$), then even simpler algebraic
equations follow instead of the (partial) differential equations. So we require

\[
dG(t, x, x') = \frac{\partial G}{\partial t}\,dt + \frac{\partial G}{\partial x}\,dx + \frac{\partial G}{\partial x'}\,dx' ,
\]

because of the starting equation

\[
H' = H + \frac{\partial G}{\partial t} , \qquad p = \frac{\partial G}{\partial x} , \qquad p' = -\frac{\partial G}{\partial x'} .
\]

If the mixed derivatives $\partial^2 G/\partial x\,\partial x'$ and $\partial^2 G/\partial x'\,\partial x$ are equal, then it follows
that $\partial p/\partial x' = -\partial p'/\partial x$, whence the transformation is canonical if in addition
$p = \partial G/\partial x$ can be solved for $x'$. Further generating functions follow from the
Legendre transformations:

\[
\begin{aligned}
G(t, x, x') &= G(t, x, p') - p'\,x' \\
&= G(t, p, x') + p\,x \\
&= G(t, p, p') + p\,x - p'\,x' .
\end{aligned}
\]


Actually, here we should use four different notations instead of just $G$, and these are
often written $G_1$, $G_2$, $G_3$, and $G_4$, i.e., generating functions of type 1, type 2, type
3, and type 4. However, only their variables are important. Each of these generating
functions depends on one primed and one unprimed variable, except for the time.
Thus we obtain the list in Table 2.2. In all these cases we also have

\[
H' = H + \frac{\partial G}{\partial t} ,
\]

with the other variables held fixed in each case. The remaining constraints for the
canonical transformation are then also fulfilled, because one constraint already takes
care of $\det K = 1$. However, there are not always all four. Thus, the identity can be
generated by $G(x, p') = x\,p'$, for example, while this is not satisfied by the
transformed function $G(x, x') = (x - x')\,p'$.


Table 2.2 Different generating functions

Generating function    Fixed variables
$G(t, x, x')$          $p = \partial G/\partial x$,    $p' = -\partial G/\partial x'$
$G(t, x, p')$          $p = \partial G/\partial x$,    $x' = \partial G/\partial p'$
$G(t, p, x')$          $x = -\partial G/\partial p$,   $p' = -\partial G/\partial x'$
$G(t, p, p')$          $x = -\partial G/\partial p$,   $x' = \partial G/\partial p'$


For functions with several pairs of parameters, mixing is also allowed. Thus the
generating function $x_1 p_1' + x_2 x_2'$ leads to $x_1' = x_1$, $p_1' = p_1$ and $x_2' = p_2$, $p_2' = -x_2$.
With the first pair, nothing is changed here, while for the second pair, position
and momentum swap names.

The canonical transformation $x = \sqrt{2p'/(m\omega)}\,\sin x'$, $p = \sqrt{2m\omega p'}\,\cos x'$ (see
p. 127) with a harmonic oscillation can be generated by the function

\[
G(x, x') = \frac{m\omega}{2}\,x^2 \cot x' ,
\]

because it leads to $p = m\omega\,x \cot x'$ and $p' = \tfrac12 m\omega\,x^2 \sin^{-2} x'$.
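The two relations $p = \partial G/\partial x$ and $p' = -\partial G/\partial x'$ for this generating function can be checked with finite differences. This is a minimal sketch; the numerical values of `m` and `w` and the helper names are assumptions made for illustration only.

```python
import math

m, w = 2.0, 1.5  # illustrative mass and angular frequency (assumed values)

def G(x, xp):
    """Type-1 generating function G(x, x') = (m w / 2) x^2 cot x'."""
    return 0.5 * m * w * x * x / math.tan(xp)

def old_from_new(xp, pp):
    """Old variables (x, p) expressed by the new ones (x', p')."""
    x = math.sqrt(2 * pp / (m * w)) * math.sin(xp)
    p = math.sqrt(2 * m * w * pp) * math.cos(xp)
    return x, p

def check(xp, pp, h=1e-6):
    """Return (p - dG/dx, p' + dG/dx'); both should vanish up to rounding."""
    x, p = old_from_new(xp, pp)
    dG_dx = (G(x + h, xp) - G(x - h, xp)) / (2 * h)
    dG_dxp = (G(x, xp + h) - G(x, xp - h)) / (2 * h)
    return p - dG_dx, pp + dG_dxp
```

Calling `check(0.8, 1.3)` should return two numbers close to zero for any admissible pair with $p' > 0$ and $\sin x' \neq 0$.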


The following canonical transformation for a point charge (with mass m) in the
homogeneous magnetic field can be derived from the generating function (Problem 2.47)


Px + Py’ ip Px — Py 


G ry T 9 y => 
(x Px» Py Py) x 7 y mi 


> 


whence it can be proven easily that the transformation mentioned on p. 128 is truly 
canonical. 


2.4.6 Transformations to Moving Reference Frames. 
Perturbation Theory 


An important application is transformations to moving reference frames. We investigate
in particular

\[
H = H_0(p) + H_1(x, p) ,
\]

in which $x$ is cyclic with respect to $H_0$, but not with respect to the total Hamilton
function. For $H_1 = 0$, the condition $\partial H_0/\partial x = 0$ leads to constant $p = p_0$ and

\[
\dot x = \frac{\partial H_0}{\partial p}\bigg|_{p = p_0} = v_0 \quad\Longrightarrow\quad x = v_0 t + x_0 .
\]

With the generalized case $H_1 \neq 0$, we now take the canonical transformation

\[
x' = x - v_0 t - x_0 , \qquad p' = p - p_0 ,
\]

which can be derived from the generating function

\[
G(t, x, p') = (x - v_0 t - x_0)\,(p_0 + p') ,
\]

with $p = \partial G/\partial x$ and $x' = \partial G/\partial p'$. Since $H' = H + \partial G/\partial t$, we have

\[
H' = H_0(p_0 + p') + H_1(v_0 t + x_0 + x',\, p_0 + p') - v_0\,(p_0 + p') .
\]


These equations have been derived without approximations. 

But these are often useful also for perturbation theory, if one has the solution for
$H_0$, but not for $H$. If we have $|H_1| \ll |H_0|$, then for not too long times, $x'$ and $p'$
will also be small compared to $x$ and $p$, because they even vanish for $H_1 = 0$. Here
we may still choose $x_0$ such that, for $t = 0$, $|H_1|$ is as small as possible compared to
$|H_0|$. The perturbation theory then works as follows. In

\[
\begin{aligned}
\dot x'(t, x', p') &= +\frac{\partial H'}{\partial p'} = \frac{\partial H_0}{\partial p'} + \frac{\partial H_1}{\partial p'} - v_0 ,\\
\dot p'(t, x', p') &= -\frac{\partial H'}{\partial x'} = -\frac{\partial H_1}{\partial x'} ,
\end{aligned}
\]


we first set x’ and p’ equal to 0 on the right, and thus find solutions to x’(t, 0, 0) and 
p’(t, 0, 0). Here the integration constant has to be fixed in such a way that x’ and p’ 
vanish for t = 0. With these approximations we can improve the expressions on the 
right of the differential equations and evaluate the next approximation, i.e., the next 
order in the Taylor expansion. Where possible, we may even be able to identify the 
complete solutions. 

If we consider as an example a harmonic oscillation and the free motion as unper- 
turbed (a coarse approximation, where here actually V = T holds), 


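\[
H_0 = \frac{p^2}{2m} , \qquad H_1 = \frac{m}{2}\,\omega^2 x^2
\]

delivers

\[
H' = \frac{(p_0 + p')^2}{2m} + \frac{m}{2}\,\omega^2 (v_0 t + x')^2 - v_0\,(p_0 + p') .
\]

With this and because of $v_0 = \partial H_0/\partial p\,|_{p = p_0} = p_0/m$, we have

\[
\dot x' = \frac{p_0 + p'}{m} - v_0 = \frac{p'}{m} , \qquad \dot p' = -m\omega^2 (v_0 t + x') ,
\]

and consequently $p' \approx -\tfrac12\,p_0\,\omega^2 t^2$ and $x' \approx -\tfrac16\,v_0\,\omega^2 t^3$. The next order delivers the
additional terms $\tfrac{1}{24}\,p_0\,\omega^4 t^4$ for $p'$ and $\tfrac{1}{120}\,v_0\,\omega^4 t^5$ for $x'$. In fact, the correct solution is

\[
p = p_0 \cos(\omega t) = p_0 + p' , \qquad x = (v_0/\omega) \sin(\omega t) = v_0 t + x' ,
\]

with $x(0) = x_0 = 0$.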
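The successive approximations for this example can be reproduced with exact rational arithmetic. The sketch below assumes units with $m = \omega = v_0 = 1$ (so $p_0 = 1$) and represents $x'(t)$ and $p'(t)$ as polynomial coefficient lists; the function names are illustrative and not part of the text. Each pass inserts the previous approximation on the right of $\dot x' = p'/m$ and $\dot p' = -m\omega^2 (v_0 t + x')$ and integrates from $t = 0$.

```python
from fractions import Fraction

def integrate(poly):
    """Integrate a polynomial [c0, c1, ...] from 0 to t (zero integration constant)."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]

def add(a, b):
    """Add two polynomials given as coefficient lists."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [u + v for u, v in zip(a, b)]

def iterate(steps):
    """Successive approximations for x'(t), p'(t) with m = w = v0 = 1 (assumed units)."""
    x1 = [Fraction(0)]  # x'(t), zeroth approximation
    p1 = [Fraction(0)]  # p'(t), zeroth approximation
    for _ in range(steps):
        rhs_x = p1                                                  # x'dot = p'/m
        rhs_p = [-c for c in add([Fraction(0), Fraction(1)], x1)]   # p'dot = -(v0 t + x')
        x1, p1 = integrate(rhs_x), integrate(rhs_p)
    return x1, p1

x1, p1 = iterate(4)
print("x'(t) coefficients:", x1)
print("p'(t) coefficients:", p1)
```

After four passes the coefficients agree with the Taylor series of $\sin t - t$ and $\cos t - 1$ up to the orders quoted above.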




2.4.7 Hamilton-Jacobi Theory 


The Hamilton–Jacobi theory is a further application of time-dependent canonical
transformations and will be explained briefly here. Note that, in his book (see the
suggestions for textbooks on p. 162), H. Goldstein devotes a whole chapter to this
subject. Unfortunately, he, along with many others, does not comply with the IUPAP
recommendations: the quantities $W$ (action function) and $S$ (characteristic function)
are used by him in the opposite notation $S$ and $W$, respectively.

In this theory the Hamilton function is transformed canonically to zero. Then all
new variables $x'$ and $p'$ are conserved quantities, fixed by the initial values. Here, the
generating function is the associated Hamilton action function $W(t, x, p')$. Because
$H'(t, x', p') = H(t, x, p) + \partial W(t, x, p')/\partial t$ for $H' = 0$ and because $p = \partial W/\partial x$,
$W$ has to satisfy the Hamilton–Jacobi differential equation

\[
\frac{\partial W}{\partial t} + H\Big(t, x, \frac{\partial W}{\partial x}\Big) = 0 .
\]

Since here $p'$ does not depend on time, we have

\[
\frac{dW}{dt} = \frac{\partial W}{\partial t} + \frac{\partial W}{\partial x}\,\dot x = -H + p\,\dot x = L
\quad\Longrightarrow\quad W = \int L\,dt .
\]

The integration constant is left out here, because we may still find a suitable gauge. 
The single partial differential equation of Hamilton and Jacobi replaces all f pairs 
of ordinary differential equations in the Hamilton theory! However, it is difficult 
to solve, because the momenta in the Hamilton function and hence the required 
functions mostly appear squared. But the theory is useful for formal considerations. 
Using this we shall be able to discover in particular a connection with geometrical 
optics (ray optics). Note that we have so far expressed all laws as differential equations 
and taken, e.g., the Lagrange function L as the quantity to start from. Now L is the 
derivative of the “anti-derivative” W, so the action has to be viewed as the original 
quantity. 

The choice of the new momenta p’ is not unique. Functions of it are also allow- 
able, and we shall choose their structure to be as simple as possible. Of course, the 
associated coordinates x’ = dW/dp’ depend upon this choice. In any case, x’ and 
p’ are constants of the motion, which have to be adjusted to the initial values. After 
that, x(t, x’, p’) and p(t, x’, p’) can be obtained. 

If the Hamilton function does not depend on time, the ansatz

\[
W(t, x, p') = S(x, p') - E\,t
\]

suffices, since it leads from $p = \partial W/\partial x$ to $p = \partial S/\partial x$ and from the
Hamilton–Jacobi equation to

\[
H\Big(x, \frac{\partial S}{\partial x}\Big) = E ,
\]

and $H$ should also be taken as energy. Since $S$ depends only on $x$ and $p'$, it is
sometimes called the reduced action, but usually the characteristic function. This
can be concluded from $\partial S/\partial x = p$ and leads to a sheet in phase space:

\[
S = \int p\,dx , \quad\text{or}\quad S = \int \sum_{k=1}^{f} p_k\,dx^k , \quad\text{with } f > 1 .
\]
k=1 


Again, the integration constant vanishes here for a suitable gauge. For periodic
motions (oscillations or rotations), we also introduce the phase integral (sometimes
called the action variable), taken along the closed path in phase space, viz.,

\[
J = \oint p\,dx ,
\]

or several action variables $J_k$ for more periodic degrees of freedom. According to
quantum theory (Bohr–Sommerfeld quantization rule), this quantity cannot change
continuously, but only in steps of the action quantum $h$ (see also p. 367).

We may take one of the new momenta $p_k'$ as energy. Then the associated coordinate
$x^{k\prime}$ is connected to the choice of the zero time, as we show now for a simple example.
If a coordinate oscillates harmonically, then $H = (p^2 + m^2\omega^2 x^2)/2m$ leads to

\[
\frac{1}{2m}\Big(\frac{\partial S}{\partial x}\Big)^2 + \frac{m}{2}\,\omega^2 x^2 = E .
\]

From this we could immediately conclude $S = \int (\partial S/\partial x)\,dx$ by integration, with the
result $S = \tfrac12 m\omega\,x \sqrt{2E/m\omega^2 - x^2} + (E/\omega) \arcsin(\sqrt{m\omega^2/2E}\;x)$. But this is
unnecessary, since with $x' = \partial W/\partial p'$ and $p' = E$, we can also immediately obtain

\[
x' = \frac{\partial W}{\partial E} = \frac{\partial S}{\partial E} - t = \frac{\partial}{\partial E} \int \frac{\partial S}{\partial x}\,dx - t
\]

and hence then, with $x' = -t_0$,

\[
t - t_0 = \frac{1}{\omega} \int \frac{dx}{\sqrt{2E/m\omega^2 - x^2}} = \frac{1}{\omega} \arcsin\Big(\sqrt{\frac{m\omega^2}{2E}}\;x\Big) .
\]

This is the solution $x = \hat x \sin[\omega (t - t_0)]$ with amplitude $\hat x = \sqrt{2p'/m\omega^2}$ and the
second adjusted parameter $t_0 = -x'$ mentioned on p. 128. Note that, inserting the
solution $x(t)$ into the expression for $S$, we can also obtain $W(t) = E/(2\omega)\,\sin[2\omega (t - t_0)]$
and $dW/dt = L$, implying $J = ET = 2\pi E/\omega$ for the phase integral.
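The value $J = 2\pi E/\omega$ of the phase integral can be confirmed by direct numerical integration around the closed orbit. This is a sketch only; the function name `phase_integral`, the midpoint rule, and the sample parameters in the usage note are assumptions for illustration.

```python
import math

def phase_integral(m, w, E, n=1000):
    """Midpoint-rule value of the closed-path integral of p dx for the oscillator,
    parametrized by theta = w (t - t0) over one full period."""
    amp = math.sqrt(2 * E / (m * w * w))      # amplitude of x(t)
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        p = m * w * amp * math.cos(theta)     # p = m dx/dt
        dx = amp * math.cos(theta) * dtheta   # dx along the orbit
        total += p * dx
    return total
```

For example, `phase_integral(1.0, 2.0, 3.0)` should agree with `2 * math.pi * 3.0 / 2.0` to within rounding, independent of the mass.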


In order to understand the properties of $W$ and $S$, we have to start from a
time-independent Hamilton function. At time zero, $W$ and $S$ agree. If, in configuration
space (i.e., the space of coordinates $x^k$), we investigate the surfaces of constant $W$ values
or $S$ values as functions of time, then the sheets of the $S$ values stay constant, while the
sheets of constant $W$ values move like a wave front. The latter follows in particular
from $dW/dt = 0$, thus $\nabla W \cdot \dot{\mathbf x} - E = 0$ or $\mathbf p \cdot \dot{\mathbf x} = E$. The larger the momentum $p$,
the smaller the velocity of the wave for given energy.

In order to understand what kind of wave this is, we consider the wave equation

\[
\Delta \psi - \frac{1}{c^2}\,\frac{\partial^2 \psi}{\partial t^2} = 0 ,
\]

where $c$ is the phase velocity of the wave, as can be seen from the ansatz for the
solution $\psi \propto \exp\{i(\mathbf k \cdot \mathbf r - \omega t)\}$, which contains the wave vector $\mathbf k$ with $k = 2\pi/\lambda$
and the angular frequency $\omega = 2\pi/T$, where $\lambda$ is the wavelength and $T$ the oscillation
period of the wave. For the differential equation to be satisfied, $ck = \omega$ or $c = \lambda/T$
has to hold. In an inhomogeneous medium, the wavelength depends on the position,
and so also does the phase velocity. For this notion of a wave to make sense at all,
we would like to assume that both vary only slowly on their paths. We thus restrict
ourselves to waves of very short wavelength or very high wave number $k$ and call
the smallest of the occurring wave numbers $k_0$. Then we can make an ansatz

\[
\psi = \exp\{A(\mathbf r) + i k_0\,(S(\mathbf r) - c_0 t)\}
\]

for the solution of the wave equation in the inhomogeneous medium with $c_0 = \omega/k_0$,
real amplitude $\exp[A(\mathbf r)]$, and real path eikonal $S(\mathbf r)$. (The word eikonal is
reminiscent of the Greek εἰκών, meaning picture or icon. With the mapping of an
object point $\mathbf r_0$ on the image point $\mathbf r_1$, both points are singular points of the wave
surfaces, and the optical paths for all connecting rays are equal to $S(\mathbf r_1) - S(\mathbf r_0)$. The
eikonal is related to the characteristic function, as we shall see soon.) In particular,
this ansatz leads to $\nabla \psi = \psi\,\nabla (A + i k_0 S)$ and

\[
\Delta \psi = \psi\,\{\Delta (A + i k_0 S) + \nabla (A + i k_0 S) \cdot \nabla (A + i k_0 S)\} ,
\]

which, according to the wave equation, should agree with $-(c_0 k_0/c)^2\,\psi$. Then, after
separation into real and imaginary parts, we infer

\[
\Delta A + \nabla A \cdot \nabla A + k_0^2\,(n^2 - \nabla S \cdot \nabla S) = 0 ,
\]

with the position-dependent refractive index $n = c_0/c$ and

\[
\Delta S + 2\,\nabla A \cdot \nabla S = 0 .
\]
Fig. 2.29 Geometrical optics and classical mechanics (beam path and particle path) have much in
common, here shown for a lens with refractive index $n = 2$. But note that the refractive index for
a wave corresponds to the ratio $c_0/c$, in contrast to the ratio $v/v_0$ for particles. Actually, we have
to distinguish between phase and particle velocity. Dashed lines are the wave fronts. Those of $W$
move in the course of time, but not those of $S$. The wave fronts are singular at the object and image
points (•)

The refractive index should barely vary, according to the assumption about the
wavelength: $k_0$ should be sufficiently large. With this we obtain the eikonal equation of
geometrical optics, viz.,

\[
\nabla S \cdot \nabla S = n^2 ,
\]

an inhomogeneous differential equation of first order and second degree. (It holds
only in the limit of short wavelengths, because otherwise we would also have to take
into account $\Delta A + \nabla A \cdot \nabla A = 0$: $\nabla A$ would have to have only drains, because its
source density would be $\nabla \cdot \nabla A = \Delta A = -\nabla A \cdot \nabla A \le 0$.) If we integrate to find
the eikonal $S(\mathbf r)$, then from the second differential equation $\Delta S + 2\,\nabla A \cdot \nabla S = 0$,
we obtain the gradient of the amplitude function $A$ in the direction of the gradient
of $S$. Perpendicular to it, the gradient of $A$ remains undetermined. In this plane it
may even vary in steps, whence, according to geometrical optics, rays are possible.
The wave propagates along $\nabla S$ (see Fig. 1.4), perpendicular to the wave fronts $S =$
const. (see Fig. 2.29).
With the Hamilton–Jacobi equation for $H = \frac{1}{2m}\,\mathbf p \cdot \mathbf p + V(\mathbf r)$, we arrive at

\[
\nabla S \cdot \nabla S = 2m\,\{E - V(\mathbf r)\} ,
\]


and hence also at the eikonal equation with n? = 2m {E — V(r)}, which however 
is not a pure number, and where the “characteristic function” appears instead of the 
eikonal. Classical mechanics can describe the motion of particles of mass m with the 
same differential equation as geometrical optics. This holds for waves of negligible 
wavelength. Conversely, the propagation of light can be viewed as the motion of 
particles (photons), as long as the wavelength is sufficiently small. 




2.4.8 Integral Principles 


So far we have derived the basic laws from differential equations, e.g., from the
Lagrange equations of the second kind

\[
\frac{d}{dt}\Big(\frac{\partial L}{\partial \dot x^k}\Big) = \frac{\partial L}{\partial x^k} .
\]


However, for the problem under consideration, there has to be a potential energy, 
or at least a generalized U. But these differential equations can also be related to 
integral expressions via the variational calculus. Then there is no need for a potential 
energy, and the basic laws can be interpreted a different way. This is also important 
for our general understanding. 

In the variational calculus, we seek functions $x(t)$ that make an integral

\[
I \equiv \int_{t_0}^{t_1} f(t, x, \dot x)\,dt
\]

extremal under constraints. Here the boundaries $t_0$ and $t_1$ are given as fixed, or at
least connected to constraints that deliver fixed boundaries after a transformation
$t \to t'$. The values of the function are also given at those boundaries, viz., $\delta x(t_0) =
0 = \delta x(t_1)$, but not their derivatives $\dot x(t_0)$ and $\dot x(t_1)$.

If we search for the “extremal” $x(t)$ for the regime between $t_0$ and $t_1$, then initially,
in addition to $x$, we also have to allow for $x + \delta x$ and hence, in addition to $\dot x$, also
for $\dot x + \delta \dot x$. Here, to begin with, the variations always refer to the same time: $\delta t = 0$
(see Fig. 2.30). Consequently, we have $\delta \dot x = \delta\,dx/dt = d\,\delta x/dt$, and therefore (with
partial integration for the second equation)

\[
\delta I = \int_{t_0}^{t_1} \Big(\frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial \dot x}\,\frac{d\,\delta x}{dt}\Big)\,dt
= \int_{t_0}^{t_1} \Big(\frac{\partial f}{\partial x} - \frac{d}{dt}\frac{\partial f}{\partial \dot x}\Big)\,\delta x\,dt .
\]

Now, for $\delta I$ to vanish for arbitrary $\delta x$,

\[
\delta I = 0 \ \text{ at } \delta t = 0 \quad\Longleftrightarrow\quad \int_{t_0}^{t_1} \delta f\,dt = 0 \ \text{ at } \delta t = 0 ,
\]

whence (uniquely) we must satisfy Euler’s differential equation

\[
\frac{d}{dt}\Big(\frac{\partial f}{\partial \dot x}\Big) - \frac{\partial f}{\partial x} = 0 .
\]

Correspondingly, for $f(t, x^1, \dots, x^f, \dot x^1, \dots, \dot x^f)$, the extremal condition
delivers a total of $f$ such differential equations of second order.

Fig. 2.30 Path variation with $\delta t = 0$ but $\delta x \neq 0$ along dashed lines. Since $\delta x(t_0) = 0 = \delta x(t_1)$,
each permitted orbit ends at the points shown by the dots (•). Since $t_1$ may follow arbitrarily quickly
after $t_0$, $\dot x(t_0)$ and $\dot x(t_1)$ effectively vary
From the Lagrange equations of the second kind, it follows that the action function
$W$ introduced on p. 135 takes an extremum, yielding Hamilton’s principle:

\[
\delta W = \delta \int_{t_0}^{t_1} L\,dt = 0 , \quad\text{at } \delta t = 0 .
\]

Among all possible paths the one with extremal $W$ is realized. We usually replace
$L$ by $T - V$. But Hamilton’s principle holds even if there is no potential energy at
all. This can be understood with d’Alembert’s principle $(m\dot{\mathbf v} - \mathbf F) \cdot \delta \mathbf r = 0$, implying
that $\mathbf F \cdot \delta \mathbf r = \delta A$ and $\dot{\mathbf v} \cdot \delta \mathbf r = d(\mathbf v \cdot \delta \mathbf r)/dt - \mathbf v \cdot \delta \dot{\mathbf r}$ hold with $m\,\mathbf v \cdot \delta \dot{\mathbf r} = \delta T$. Since
$\mathbf v \cdot \delta \mathbf r$ vanishes at the integration limits, we thus obtain $\int (\delta T + \delta A)\,dt = 0$. This we
may also write as (general Hamilton principle)

\[
\delta \int_{t_0}^{t_1} T\,dt + \int_{t_0}^{t_1} \delta A\,dt = 0 , \quad\text{at } \delta t = 0 .
\]

Note that the virtual work $\delta A$ makes sense, but the work $A$ does not generally as
such. Only if a potential energy $V$ produces the (external) forces do we have

\[
\delta T + \delta A = \delta (T - V) = \delta L ,
\]

and then the variation can be moved in front of the integral.
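That the realized path makes the action stationary can be illustrated numerically. The sketch below assumes a harmonic oscillator with $m = \omega = 1$, the exact path $x(t) = \sin t$ on $0 \le t \le 1$, and a trial variation $\delta x = \varepsilon \sin(\pi t)$ that vanishes at both ends; the function names and these choices are illustrative and not taken from the text.

```python
import math

def action(eps, n=2000, t1=1.0):
    """Midpoint-rule action for L = x_dot**2/2 - x**2/2 on the varied path
    x(t) = sin t + eps * sin(pi t / t1); the variation vanishes at t = 0 and t = t1."""
    dt = t1 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x = math.sin(t) + eps * math.sin(math.pi * t / t1)
        v = math.cos(t) + eps * (math.pi / t1) * math.cos(math.pi * t / t1)
        total += (0.5 * v * v - 0.5 * x * x) * dt
    return total

def first_variation(eps=1e-3):
    """Central-difference estimate of dW/d(eps) at eps = 0."""
    return (action(eps) - action(-eps)) / (2 * eps)
```

The value returned by `first_variation()` should be zero up to the quadrature error, while `action(eps)` for small nonzero `eps` of either sign exceeds `action(0.0)`, so for this short time interval the extremum is a minimum.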

Hamilton’s principle does not depend on the choice of coordinates. Arbitrary
(unique) transformations of $t$ and of the generalized coordinates $x^k$ are permitted.
We only need to be able to give $T$ and $V$ or, respectively, $\delta A$. With this, we have
a general basis for the problems of mechanics, and even for friction. If there is a
potential energy and hence also a Lagrange function, then from the same principle
we can immediately conclude that

\[
L' = L - \frac{dG}{dt}
\]

is also an allowable Lagrange function (gauge invariance, see p. 98).
Another integral principle is the action principle (due to Maupertuis, Leibniz,
Euler, Lagrange), for which, however, it is not the action $W$ that is varied, but the
characteristic function (reduced action) $S$, and where the energy is held fixed instead
of the time (and likewise the integration limits $\mathbf r_0$ and $\mathbf r_1$). In addition, the Hamilton
function must not depend on time for $S$ to be formed. In particular, $S = E\,t + W$
with $E = -\partial W/\partial t$, and therefore $\delta S = t\,\delta E + E\,\delta t + \delta W$. Then from $\delta W = 0$ for
$\delta t = 0$, we have $\delta S = 0$ for $\delta E = 0$:

\[
\delta S = \delta \int \mathbf p \cdot d\mathbf r = 0 , \quad\text{for } \delta E = 0 .
\]

The action principle is often written in the form

\[
\delta \int_{t_0}^{t_1} 2T\,dt = 0 , \quad\text{for } \delta E = 0 .
\]


In fact, $dS$ is not only equal to $p\,dx$, but also to $2T\,dt$, because for $dE = 0$, we
can derive $dS = 2T\,dt$ from $dS = dW + E\,dt$ and $dW = L\,dt$ with $L = T - V =
2T - E$. However, we must remember here that the integration limits will now also
be varied, because times of different lengths are necessary for the different paths
between $\mathbf r_0$ and $\mathbf r_1$, if the kinetic energy is determined by a potential energy.

For a force-free motion neither $T$ nor $V$ is altered, and thus

\[
\delta \int_{t_0}^{t_1} dt = \delta \{t_1 - t_0\} = 0 , \quad\text{with constant } T \text{ and } V .
\]


This principle of least time due to Fermat had already been applied by Hero of
Alexandria to the refraction of light. (It could also be a principle of latest arrival,
because we only search for an extremum with the variational calculus. Therefore, I
have also avoided the name principle of least action for the action principle.) Here
the position coordinates are missing in the Hamilton function, e.g., $S = \mathbf p \cdot \mathbf x$. With
the characteristic function and for the action function, each cyclic coordinate $x$ leads
to a term $p\,x$, which comprises the whole $x$-dependence!

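So far, for all transformations, the time $t$ has not been altered, but treated as
an invariant parameter. If we had altered it in addition to the position and momentum
coordinates, then we would have had to keep fixed another parameter $\tau$ in the
variation; some parameter has to mark the progress along the path. Then since

\[
L\,dt = \Big(\sum_k p_k\,\dot x^k - H\Big)\,dt = \Big(\sum_k p_k\,\frac{dx^k}{d\tau} - H\,\frac{dt}{d\tau}\Big)\,d\tau ,
\]

a generalized Hamilton principle has the form

\[
\delta \int_{\tau_0}^{\tau_1} \Big(\sum_{k=1}^{f} p_k\,\frac{dx^k}{d\tau} - H\,\frac{dt}{d\tau}\Big)\,d\tau = 0 , \quad\text{with } \delta \tau = 0 .
\]

This suggests taking $t$ as a further coordinate $x^0$ and $-H$ as its conjugate momentum:

\[
\delta \int_{\tau_0}^{\tau_1} \sum_{k=0}^{f} p_k\,\frac{dx^k}{d\tau}\,d\tau = 0 , \quad\text{with } \delta \tau = 0 .
\]

After a canonical transformation here, $p_k'$ and $x'^k$ would appear, with $-p_0'$ as the new
Hamilton function and $x'^0$ the new time. With a generating function $G(x^k, p_k')$, we
obtain $f + 1$ pairs of equations

\[
p_k = \frac{\partial G}{\partial x^k} , \qquad x'^k = \frac{\partial G}{\partial p_k'} , \quad\text{for } k \in \{0, \dots, f\} .
\]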


These more general equations are only necessary for time-dependent Hamilton 
functions. As an example of this, we consider the time-dependent oscillator, in 
Sect. 2.4.11. 


2.4.9 Motion in a Central Field 


For a central field the angular momentum is conserved. We may restrict ourselves to a
plane orbit with polar coordinates $r$ and $\varphi$. According to p. 97, since $p_r = \partial L/\partial \dot r =
m\dot r$ and $p_\varphi = \partial L/\partial \dot\varphi = m r^2 \dot\varphi$, we obtain for the kinetic energy

\[
T = \frac{m}{2}\,(\dot r^2 + r^2 \dot\varphi^2) = \frac{1}{2m}\Big(p_r^2 + \frac{p_\varphi^2}{r^2}\Big) .
\]

Since $\varphi$ does not appear in $L = T - V(r)$, the component of the angular momentum
perpendicular to the plane of motion, $p_\varphi$, is a constant of the motion. Since the energy
$E$ is also conserved, conservation of energy can be used:

\[
\dot r^2 = \frac{2}{m}\Big\{E - V(r) - \frac{p_\varphi^2}{2m r^2}\Big\} , \qquad \dot\varphi = \frac{p_\varphi}{m r^2} .
\]

The last term inside the curly brackets comes from the centrifugal force. Part of
the energy appears because of the centrifugal potential as rotational energy. In the
ordinary differential equation $\dot r = f(r)$, the variables can be separated and then
integrated:

\[
t - t_0 = \int_{r_0}^{r} \frac{m\,dr'}{\sqrt{2m\,\{E - V(r')\} - (p_\varphi/r')^2}} .
\]

Hence $t(r)$ or $r(t)$ can be obtained. Then the last expression for $\dot\varphi$ no longer contains
any unknown term. This equation supplies the area–velocity law: $r^2 \dot\varphi = (\mathbf r \times \mathbf v) \cdot
\mathbf n = p_\varphi/m$. The integration constants are $E$, $p_\varphi$, $r_0$, and $\varphi_0$.
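The separated integral for $t - t_0$ lends itself to direct numerical quadrature as long as no turning point lies inside the integration interval. The following sketch uses Simpson's rule; the function name `travel_time`, its signature, and the free-particle usage example are assumptions made for illustration.

```python
import math

def travel_time(r_a, r_b, m, E, p_phi, V, n=1000):
    """Simpson-rule value of t(r_b) - t(r_a), i.e. the integral of
    m dr / sqrt(2 m (E - V(r)) - (p_phi / r)**2) from r_a to r_b.

    Assumes the radicand stays positive on [r_a, r_b] (no turning point inside)."""
    if n % 2:
        n += 1                      # Simpson's rule needs an even number of intervals
    h = (r_b - r_a) / n

    def f(r):
        return m / math.sqrt(2 * m * (E - V(r)) - (p_phi / r) ** 2)

    total = f(r_a) + f(r_b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(r_a + i * h)
    return total * h / 3
```

As a usage check, a free particle ($V = 0$) with $m = 1$, $E = 1/2$, $p_\varphi = 1$ moves on a straight line with $r(t)^2 = 1 + t^2$, so `travel_time(2.0, 3.0, 1.0, 0.5, 1.0, lambda r: 0.0)` should be close to $\sqrt{8} - \sqrt{3}$.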


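In many cases, we desire only the equation $r(\varphi)$ of the orbit. Then we use

\[
\frac{dr}{d\varphi} = \frac{\dot r}{\dot\varphi} = \frac{r^2}{p_\varphi}\,\sqrt{2m\,\{E - V(r)\} - (p_\varphi/r)^2}
\]

and separate again in terms of variables. If the radicand vanishes, we have to expect a
circular orbit, since then $\dot r = 0$ and thus $r = r_0$ and $\varphi = p_\varphi t/(m r_0^2) + \varphi_0$ (if $p_\varphi \neq 0$
and $r_0 > 0$).

The Hamilton–Jacobi equation for this problem reads

\[
\frac{\partial W}{\partial t} + \frac{1}{2m}\Big\{\Big(\frac{\partial W}{\partial r}\Big)^2 + \frac{1}{r^2}\Big(\frac{\partial W}{\partial \varphi}\Big)^2\Big\} + V(r) = 0 .
\]

Since $t$ and $\varphi$ do not occur in $H$, we may set $W = S(r) + p_\varphi\,\varphi - E\,t$, and from the
last differential equation, we obtain

\[
S = \int \sqrt{2m\,\{E - V(r)\} - (p_\varphi/r)^2}\;dr .
\]

This expression also delivers the orbit equation, because it yields $\varphi' = \partial W/\partial p_\varphi =
\partial S/\partial p_\varphi + \varphi$. According to this, $r$ and $\varphi$ are then related, as we have found before
from $dr/d\varphi$:

\[
\varphi - \varphi' = \int \frac{(p_\varphi/r^2)\,dr}{\sqrt{2m\,\{E - V(r)\} - (p_\varphi/r)^2}} .
\]

Likewise, we could also have arrived immediately at $-t_0 = \partial W/\partial E = \partial S/\partial E - t$.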
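From the beginning we have only considered plane orbits. If this plane is still
unknown, then spherical coordinates are suggested. Then we have

\[
T = \frac{m}{2}\,(\dot r^2 + r^2 \dot\theta^2 + r^2 \sin^2\theta\,\dot\varphi^2)
= \frac{1}{2m}\Big(p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\varphi^2}{r^2 \sin^2\theta}\Big) ,
\]

with $p_\theta = m r^2 \dot\theta$ and (the new) $p_\varphi = m r^2 \sin^2\theta\,\dot\varphi$. With $W = S - E\,t$, this leads to
the Hamilton–Jacobi equation

\[
\frac{1}{2m}\Big\{\Big(\frac{\partial S}{\partial r}\Big)^2 + \frac{1}{r^2}\Big(\frac{\partial S}{\partial \theta}\Big)^2 + \frac{1}{r^2 \sin^2\theta}\Big(\frac{\partial S}{\partial \varphi}\Big)^2\Big\} + V(r) = E .
\]

Since $\varphi$ does not appear here, we have a conserved quantity

\[
\frac{\partial S}{\partial \varphi} = p_\varphi
\]

in addition to the energy $E$. For a central force, each component of the angular
momentum is conserved, thus also the square of the angular momentum, which we
denote here as

\[
p_{\theta\varphi}^2 = \Big(\frac{\partial S}{\partial \theta}\Big)^2 + \frac{1}{\sin^2\theta}\Big(\frac{\partial S}{\partial \varphi}\Big)^2 .
\]

From this we conclude

\[
\frac{1}{2m}\Big\{\Big(\frac{\partial S}{\partial r}\Big)^2 + \frac{p_{\theta\varphi}^2}{r^2}\Big\} + V(r) = E .
\]

Here, $p_\varphi$ is no longer of interest, but only the conserved quantities $p_{\theta\varphi}$ and $E$. For
central forces there is a degeneracy, because different $p_\varphi$ lead to the same $p_{\theta\varphi}$. The
last equation once again delivers the above-mentioned expression for $\dot r$, since

\[
p_r = \frac{\partial W}{\partial r} = \frac{\partial S}{\partial r} = \sqrt{2m\,\{E - V(r)\} - \frac{p_{\theta\varphi}^2}{r^2}}
\]

is equal to $m\dot r$.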


2.4.10 Heavy Symmetrical Top and Spherical Pendulum 


If the center of mass of a pendulum moves on a spherical surface, we have a spherical 
pendulum—or even a heavy top, if the body rotates about the axis connecting the 
hinge and the center of mass. The spherical pendulum is not much simpler to treat 
than the heavy top, and clearly a special case of the top, which we would like to deal 
with anyway. 

If the center of mass does not lie on the vertical through the rotational point, 
the gravitational force exerts a torque and changes the angular momentum along 
the horizontal direction. Hence, consideration of the “free” top in Sect. 2.2.11 is 
no longer adequate. The kinetic energy of the top reads most simply in Cartesian 
coordinates along the principal axes of the moment of inertia fixed in the body: 


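\[
T = \tfrac12\,(I_1 \omega_1^2 + I_2 \omega_2^2 + I_3 \omega_3^2) .
\]

On the other hand, the Euler angles are suitable coordinates to describe the motion
in space. Therefore we express $\boldsymbol\omega$ using the Euler angles and their derivatives with
respect to time.

In the body-fixed system, the space-fixed $z$-axis has polar angles $\beta$ and $\pi - \gamma$
(see Fig. 1.10). Therefore, for a rotational vector proportional to $\dot\alpha$, it follows that
(Problem 2.4)

\[
\boldsymbol\omega_\alpha = \dot\alpha\,\{\sin\beta\,(-\cos\gamma\;\mathbf e_1 + \sin\gamma\;\mathbf e_2) + \cos\beta\;\mathbf e_3\} .
\]

Correspondingly, $\boldsymbol\omega_\beta = \dot\beta\,(\sin\gamma\;\mathbf e_1 + \cos\gamma\;\mathbf e_2)$ and $\boldsymbol\omega_\gamma = \dot\gamma\;\mathbf e_3$, whence

\[
\begin{aligned}
\omega_1 &= -\dot\alpha \sin\beta \cos\gamma + \dot\beta \sin\gamma ,\\
\omega_2 &= \dot\alpha \sin\beta \sin\gamma + \dot\beta \cos\gamma ,\\
\omega_3 &= \dot\alpha \cos\beta + \dot\gamma .
\end{aligned}
\]

Hence we have $\omega_1^2 + \omega_2^2 = \dot\alpha^2 \sin^2\beta + \dot\beta^2$. Since with $s$ as the distance of the
center of mass from the rotational point, the potential energy is

\[
V = m g s \cos\beta ,
\]

we shall restrict ourselves in the following to a symmetrical top ($I_1 = I_2$) or a symmetric
pendulum. Then, since

\[
T = \tfrac12 I_1\,(\dot\alpha^2 \sin^2\beta + \dot\beta^2) + \tfrac12 I_3\,(\dot\alpha \cos\beta + \dot\gamma)^2 ,
\]

$\alpha$ and $\gamma$ are cyclic coordinates, $\partial H/\partial \alpha = 0 = \partial H/\partial \gamma$, and thus the associated
generalized momenta (the angular-momentum components along the lab-fixed and the
body-fixed $z$-axes) are constants of the motion:

\[
\begin{aligned}
p_\gamma &= \frac{\partial L}{\partial \dot\gamma} = I_3\,(\dot\alpha \cos\beta + \dot\gamma) = \text{const.} ,\\
p_\alpha &= \frac{\partial L}{\partial \dot\alpha} = I_1\,\dot\alpha \sin^2\beta + p_\gamma \cos\beta = \text{const.}
\end{aligned}
\]

(If $p_\gamma = 0$, then we have a spherical pendulum instead of the top; for the plane
pendulum, $p_\alpha = 0$ also holds.) Only $p_\beta = \partial L/\partial \dot\beta = I_1 \dot\beta$ still depends on time.
But this is therefore a one-dimensional problem, which we simply solve using the
conservation of energy, thereby avoiding a differential equation of second order:

\[
H = \frac{1}{2 I_1}\Big\{\Big(\frac{p_\alpha - p_\gamma \cos\beta}{\sin\beta}\Big)^2 + p_\beta^2\Big\} + \frac{p_\gamma^2}{2 I_3} + m g s \cos\beta
\]

is a constant of the motion. Hence we now have to determine $\beta(t)$. The expression
for $p_\alpha$ leads to a linear differential equation of first order for $\alpha(t)$, and the expression
for $p_\gamma$ to a similar equation for $\gamma(t)$.

In order to avoid the transcendent circular functions, we set

\[
\cos\beta = z \quad\Longrightarrow\quad \dot\beta = -\frac{\dot z}{\sqrt{1 - z^2}}
\]

and then obtain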


\[
\dot z^2 = (1 - z^2)\,\Big(H - \frac{p_\gamma^2}{2 I_3} - m g s\,z\Big)\,\frac{2}{I_1} - \frac{(p_\alpha - p_\gamma z)^2}{I_1^2}
\equiv \frac{2 m g s}{I_1}\,f(z) .
\]

Fig. 2.31 The three Jacobi elliptic functions $\mathrm{sn}(\tau|k^2)$ (continuous red), $\mathrm{cn}(\tau|k^2)$ (dashed blue),
and $\mathrm{dn}(\tau|k^2)$ (dotted black) for the parameter $k^2 = \tfrac12$. Compare also with Fig. 2.18


Here, $f(z)$ is a polynomial of third order in $z$, which is important for us only in
the regime $-1 \le z \le 1$, and there also only for $f(z) \ge 0$. Now $f(z)$ is positive for
$z \gg 1$ and negative for $z = \pm 1$ (or zero in the special case of a top with perpendicular
axis of rotation and therefore without torque). Thus only the two lower zeros
of $f(z)$ are relevant here. The differential equation can be solved with the Jacobi
function $\mathrm{sn}(\tau|k^2)$ mentioned on p. 105. For this as for the other elliptic functions,
it is customary (see, e.g., [1]) to number the zeros $z_i$ of the polynomials in order of
decreasing value, viz., $z_1 > z_2 > z_3$. The zero time can be chosen as the integration
constant:

\[
z(t) = z_3 + (z_2 - z_3)\;\mathrm{sn}^2\Big(\sqrt{\frac{m g s}{2 I_1}\,(z_1 - z_3)}\;(t - t_0)\;\Big|\;\frac{z_2 - z_3}{z_1 - z_3}\Big) .
\]

The derivative of $\mathrm{sn}\,\tau$ is equal to the product of the Jacobi elliptic functions cosinus
amplitudinis $\mathrm{cn}\,\tau$ and delta amplitudinis $\mathrm{dn}\,\tau$ (see Fig. 2.31):

\[
\mathrm{cn}(\tau|k^2) = \cos(\mathrm{am}(\tau|k^2)) , \qquad
\mathrm{dn}(\tau|k^2) = \sqrt{1 - k^2\,\mathrm{sn}^2(\tau|k^2)} .
\]

Consequently, in addition to $\mathrm{sn}(\tau|k^2) = \sin(\mathrm{am}(\tau|k^2))$ and $\mathrm{sn}'(\tau|k^2) = \mathrm{cn}(\tau|k^2)\,
\mathrm{dn}(\tau|k^2)$, we have

\[
\mathrm{sn}^2(\tau|k^2) = 1 - \mathrm{cn}^2(\tau|k^2) = \frac{1 - \mathrm{dn}^2(\tau|k^2)}{k^2} .
\]

The above-mentioned expression $z(t)$ therefore satisfies the original differential
equation $\dot z^2 = (z - z_1)(z - z_2)(z - z_3)\,2 m g s/I_1$ for $z_3 \le z \le z_2 \le z_1$. The figure
axis of the heavy top thus tumbles back and forth between two circles of latitude
$\beta_{2,3} = \arccos z_{2,3}$ (with $\beta_2 < \beta_3$). For the first return to the old circle of latitude, half
an “oscillation” is performed. Thus the oscillation period is

\[
T = 2\,\sqrt{\frac{I_1}{2 m g s}} \int_{z_3}^{z_2} \frac{dz}{\sqrt{f(z)}}
= \sqrt{\frac{2 I_1}{m g s}}\;\frac{2}{\sqrt{z_1 - z_3}}\;K\Big(\frac{z_2 - z_3}{z_1 - z_3}\Big) .
\]

As with the plane pendulum (see p. 104), we thus arrive at a complete elliptic integral
$K$; however, we still have to determine the three solutions $z_i$ (see Fig. 2.32).

Fig. 2.32 Orbits of the body axis of a heavy symmetric top (red line). Left: With loops. Centre:
With peaks. Right: With simple passes. Dashed blue lines are the limiting circles of latitude of the
intersections of the figure axis on the sphere
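Once the three zeros of $f(z)$ are known, the oscillation period follows from the complete elliptic integral of the first kind. The sketch below evaluates $K(k^2)$ with the arithmetic-geometric mean and cross-checks the result against a direct quadrature; the function names and the sample zeros in the usage note are hypothetical and chosen only for illustration.

```python
import math

def complete_K(k2):
    """Complete elliptic integral of the first kind K(k^2), 0 <= k^2 < 1,
    via the arithmetic-geometric mean: K = pi / (2 AGM(1, sqrt(1 - k^2)))."""
    a, b = 1.0, math.sqrt(1.0 - k2)
    while abs(a - b) > 1e-15 * a:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def nutation_period(z1, z2, z3, mgs, I1):
    """T = sqrt(2 I1 / mgs) * 2 K(k^2) / sqrt(z1 - z3), with k^2 = (z2 - z3)/(z1 - z3)."""
    k2 = (z2 - z3) / (z1 - z3)
    return math.sqrt(2.0 * I1 / mgs) * 2.0 * complete_K(k2) / math.sqrt(z1 - z3)

def period_by_quadrature(z1, z2, z3, mgs, I1, n=2000):
    """Same period from T = 2 sqrt(I1 / (2 mgs)) * integral of dz / sqrt(f(z)),
    using z = z3 + (z2 - z3) sin^2(theta) to remove the endpoint singularities."""
    k2 = (z2 - z3) / (z1 - z3)
    dth = 0.5 * math.pi / n
    s = sum(dth / math.sqrt(1.0 - k2 * math.sin((i + 0.5) * dth) ** 2)
            for i in range(n))
    return 2.0 * math.sqrt(I1 / (2.0 * mgs)) * 2.0 * s / math.sqrt(z1 - z3)
```

For instance, with the made-up zeros $z_1 = 1.5$, $z_2 = 0.6$, $z_3 = -0.2$ and $mgs = I_1 = 1$, both functions should return the same period to high accuracy.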

For the tumbling motion, there are simple passes, but also loops or peaks. This can
be read off from the zeros of $I_1 \dot\alpha = (p_\alpha - p_\gamma \cos\beta)/\sin^2\beta$, which are determined
by $p_\alpha - p_\gamma z$: for $z_3 < p_\alpha/p_\gamma < z_2$, there are loops, for $p_\alpha/p_\gamma$ equal to $z_3$ or $z_2$, there
are peaks, and otherwise (with $p_\alpha/p_\gamma < z_3$ or $p_\alpha/p_\gamma > z_2$), neither loops nor peaks.
This clearly holds also for the force-free top (with $mgs = 0$), which was already dealt
with in Sect. 2.2.12.

Peaks occur, e.g., for the frequent initial condition $\dot\alpha(0) = \dot\beta(0) = 0$, for motions
with an energy as small as possible, because $\dot\alpha(0) = 0$ delivers $z(0) = p_\alpha/p_\gamma$, and
since $\dot\beta(0) = 0$, $\dot z$ also vanishes initially and hence so does $f(z)$. We thus start from
one of the limiting circles of latitude with a peak. In fact, the nutation starts from the
upper circle of latitude ($z_2$), because there the potential energy is highest, whence
the kinetic energy is lowest. For these initial conditions, we already know the zero
$z_2$ of $f(z)$, viz.,

\[
z_2 = \frac{p_\alpha}{p_\gamma} = \frac{1}{m g s}\Big(H - \frac{p_\gamma^2}{2 I_3}\Big) ,
\]

and can determine the other zero $z_3$ more easily from a second-order equation,
because

\[
m g s\,f(z) = (z_2 - z)\,\Big(m g s\,(1 - z^2) - \frac{p_\gamma^2}{2 I_1}\,(z_2 - z)\Big)
\]


148 2 Classical Mechanics 
delivers mgs (1—z3") = [p,?/(2h)] (Z2—z3). For a fast top, Py’ /2b) > mgs 


holds. If now J; is not very much greater than J3, then because 0 < z3 < l, it 
follows that z3 ~% z2. Therefore, we obtain 


sin” B(0) , 


mgs 
22° za 
Py?/(2N) 
i.e., the faster the top rotates, the less its nutation. It can also happen that the two circles of latitude coincide; then $z$ and hence $\beta$ are constant, as are $\dot\alpha$ and $\dot\psi$, and we have regular precession. For very small nutation compared to the precession, we speak of pseudo-regular precession.
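
The fast-top estimate can be compared with the exact root of the second-order equation. The sketch below assumes the relation $mgs\,(1-z_3^2) = E_{\rm rot}\,(z_2-z_3)$ with the abbreviation $E_{\rm rot} = p_\psi^2/(2I_1)$ (a name introduced here only for illustration) and hypothetical numerical values.

```python
import math

def lower_circle(e_rot, mgs, z2):
    """Root z3 of mgs*(1 - z3**2) = e_rot*(z2 - z3) that lies below z2.

    Written as 2c/(b + sqrt(disc)) to avoid cancellation for a fast top.
    """
    disc = e_rot**2 - 4.0 * mgs * (e_rot * z2 - mgs)
    return 2.0 * (e_rot * z2 - mgs) / (e_rot + math.sqrt(disc))

def nutation_estimate(e_rot, mgs, z2):
    """Fast-top approximation z2 - z3 ~ (mgs/e_rot) * sin^2(beta(0)), with z2 = cos(beta(0))."""
    return mgs / e_rot * (1.0 - z2**2)

# Hypothetical fast top: rotational energy 100 times larger than mgs
e_rot, mgs, z2 = 100.0, 1.0, 0.5
z3 = lower_circle(e_rot, mgs, z2)
exact_width = z2 - z3
approx_width = nutation_estimate(e_rot, mgs, z2)
```

For these values the two widths differ only in the next order of the small ratio $mgs/E_{\rm rot}$, which illustrates why the nutation shrinks as the top spins faster.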

The differential equation $I_1\dot\alpha = (p_\alpha - p_\psi z)/(1 - z^2)$ for the Euler angle $\alpha$ can be reformulated in the following way using $\dot\alpha = (d\alpha/dz)\,\dot z = \sqrt{2mgs/I_1}\,\sqrt{f(z)}\;d\alpha/dz$:

$$\frac{d\alpha}{dz} = \sqrt{\frac{p_\psi^2/(2I_1)}{mgs}}\;\frac{1}{2\sqrt{f(z)}}\Big(\frac{1+p_\alpha/p_\psi}{1+z} - \frac{1-p_\alpha/p_\psi}{1-z}\Big),$$

with $f(z) = (z-z_1)(z-z_2)(z-z_3)$ and $z_1 > z_2 > z > z_3$. The solution of this differential equation can be given with the help of the incomplete elliptic integral of the third kind


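$$\Pi(n;\varphi\,|\,k^2) = \int_0^{\varphi}\frac{d\vartheta}{(1-n\sin^2\vartheta)\sqrt{1-k^2\sin^2\vartheta}} = \int_0^{\sin\varphi}\frac{dt}{(1-nt^2)\sqrt{(1-t^2)(1-k^2t^2)}},$$

and with the integral of the first kind $F(\varphi\,|\,k^2)$ from Sect. 2.3.6. With the abbreviations

$$g(z) = \sqrt{\frac{z-z_3}{z_2-z_3}}\quad\text{and}\quad k^2 = \frac{z_2-z_3}{z_1-z_3},$$

both with values between 0 and 1, we have in particular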


$$\int_{z_3}^{z}\frac{dz'}{(p-z')\sqrt{f(z')}} = \frac{2}{\sqrt{z_1-z_3}}\;\frac{1}{p-z_3}\;\Pi\Big(\frac{z_2-z_3}{p-z_3};\,\arcsin g(z)\,\Big|\,k^2\Big).$$

Fig. 2.33 Complete elliptic integrals of the first kind $K(k^2)$ (continuous green), of the second kind $E(k^2)$ (continuous red), and of the third kind $\Pi(n\,|\,k^2)$ (dashed black), where $n$ changes in steps of 1/4 (top 3/4, bottom $-1$). We also have $\Pi(0\,|\,k^2) = K(k^2)$

Therefore, after an oscillation period $T$, the body axis does not return to the initial point, in contrast to what happens with the plane pendulum, but precesses about the angle

$$\Delta\alpha = 4\sqrt{\frac{p_\psi^2/(2I_1)}{mgs\,(z_1-z_3)}}\;\Big\{\frac{1+p_\alpha/p_\psi}{1+z_3}\,\Pi\Big(-\frac{z_2-z_3}{1+z_3};\,\tfrac12\pi\,\Big|\,k^2\Big) - \frac{1-p_\alpha/p_\psi}{1-z_3}\,\Pi\Big(+\frac{z_2-z_3}{1-z_3};\,\tfrac12\pi\,\Big|\,k^2\Big)\Big\}.$$


Due to the argument $\frac12\pi$ here, complete elliptic integrals of the third kind occur, written for short $\Pi(n\,|\,k^2)$ (see Fig. 2.33).
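
As an illustration, the complete integral of the third kind can be evaluated by direct quadrature of its defining integral $\int_0^{\pi/2} d\vartheta/[(1-n\sin^2\vartheta)\sqrt{1-k^2\sin^2\vartheta}]$. This is a minimal sketch with hypothetical function names, not a library routine; it is only meant for $n < 1$ and $0 \le k^2 < 1$, where the integrand is regular.

```python
import math

def ellipk(m):
    """Complete elliptic integral of the first kind K(m), m = k^2, via the AGM."""
    a, b = 1.0, math.sqrt(1.0 - m)
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def ellippi_complete(n, m, steps=4000):
    """Complete elliptic integral of the third kind Pi(n | m) by the midpoint rule.

    Assumes n < 1 and 0 <= m < 1 so that the integrand has no singularity.
    """
    h = 0.5 * math.pi / steps
    total = 0.0
    for i in range(steps):
        s2 = math.sin((i + 0.5) * h) ** 2
        total += 1.0 / ((1.0 - n * s2) * math.sqrt(1.0 - m * s2))
    return h * total

# Two limiting cases that can be checked independently:
pi_n0 = ellippi_complete(0.0, 0.3)     # should equal K(0.3)
pi_m0 = ellippi_complete(-1.0, 0.0)    # should equal pi / (2 sqrt(1 - n)) for n = -1
```

The first limit reproduces the statement $\Pi(0\,|\,k^2) = K(k^2)$ from the figure caption; the second uses the elementary integral that results for $k^2 = 0$.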


2.4.11 Canonical Transformation of Time-Dependent 
Oscillators 


The time-dependent oscillator investigated in Sect. 2.3.10 offers an instructive exam- 
ple of how a canonical transformation can transform a time-dependent Hamilton 
function into a time-independent one. 

According to Floquet, Hill’s differential equation $\ddot x + f(t)\,x = 0$ with $f(t+T) = f(t)$ also has quasi-periodic solutions $x_F(t) = y(t)\exp(\mathrm i\phi t/T)$ with $y(t+T) = y(t)$. Here, $\phi$ is real for stable solutions, to which we would like to restrict ourselves here, even if then not all periodic functions $f(t)$ are allowed. We now take $x_F$ and $x_F^*$ as the fundamental system and set $w = (\dot x_F x_F^* - x_F\dot x_F^*)/(2\mathrm i) = w^* > 0$. (It will turn out that $w$ corresponds to an angular frequency. The similarity with $\omega$ is intended. For $w < 0$, we have to swap $x_F \leftrightarrow x_F^*$.) The value $w$ does not depend on $t$, because it is the Wronski determinant of the two solutions, except for the factor $2\mathrm i$ in the denominator. Two real fundamental solutions are often taken, which behave for $t \approx 0$ like the circular functions $\cos(wt)$ and $\sin(wt)$. Here we prefer $\exp(\pm\mathrm i wt)$ for $t \approx 0$.


Fig. 2.34 Solutions of the Mathieu differential equation $\ddot x = \frac12 q\,\Omega^2\cos(\Omega t)\,x$ for $q = 1/4$ (dotted black), $q = 2/4$ (continuous red), and $q = 3/4$ (dashed blue). Amplitude $A$ (left) and phase $\varphi$ (right) of the Floquet solutions as a function of $t/T$. The amplitude has period $T$, while the phase increases by $\phi$ during this time


In the following, it will be useful to set $x = A\exp(\mathrm i\varphi)$ with real functions $A(t)$ and $\varphi(t)$. From Hill’s differential equation, we then have the two equations

$$\ddot A + fA = \frac{w^2}{A^3}\quad\text{and}\quad\dot\varphi = \frac{w}{A^2}.$$

Here the quasi-periodicity of the Floquet solution $x_F$ also delivers

$$A_F(t+T) = A_F(t)\quad\text{and}\quad\varphi_F(t+T) = \varphi_F(t) + \phi.$$

The amplitude $A_F$ is thus strictly periodic, while the phase $\varphi_F$ increases by $\phi$ with each period $T$. Note that $\phi > 0$ holds because $\dot\varphi_F = w/A^2 > 0$.

In the following, we leave out the index F and choose as initial conditions $A(0) = 1$, $\dot A(0) = 0$, and $\varphi(0) = 0$. Then $w$ is also uniquely determined.
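
The amplitude and phase equations can be tested against a direct integration of Hill's equation. The sketch below assumes the pair $\ddot A + fA = w^2/A^3$, $\dot\varphi = w/A^2$ with $A(0) = 1$, $\dot A(0) = 0$, $\varphi(0) = 0$, and compares $A\cos\varphi$ with the solution of $\ddot x + f x = 0$ for $x(0) = 1$, $\dot x(0) = 0$. The Mathieu parameters, the value of $w$ (here not tuned to give a Floquet solution), and the Runge-Kutta stepper are illustrative choices, not taken from the text.

```python
import math

OMEGA, A_PAR, Q_PAR = 1.0, 2.0, 0.5   # hypothetical Mathieu parameters
W = 0.5                               # hypothetical positive constant w

def f(t):
    """Mathieu coefficient f(t) = Omega^2/4 * (a - 2 q cos(Omega t))."""
    return 0.25 * OMEGA**2 * (A_PAR - 2.0 * Q_PAR * math.cos(OMEGA * t))

def rhs(t, y):
    """y = [x, xdot, A, Adot, phi]."""
    x, xdot, amp, ampdot, phi = y
    return [xdot, -f(t) * x, ampdot, -f(t) * amp + W**2 / amp**3, W / amp**2]

def rk4_step(t, y, h):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(t, y)
    k2 = rhs(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
    k3 = rhs(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
    k4 = rhs(t + h, [yi + h * k for yi, k in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def max_deviation(t_end=20.0, h=0.002):
    """Largest difference between x(t) and A(t) cos(phi(t)) along the trajectory."""
    y, t, worst = [1.0, 0.0, 1.0, 0.0, 0.0], 0.0, 0.0
    for _ in range(int(round(t_end / h))):
        y = rk4_step(t, y, h)
        t += h
        worst = max(worst, abs(y[0] - y[2] * math.cos(y[4])))
    return worst
```

The deviation stays at the level of the integration error for any positive $w$; the special value of $w$ belonging to the Floquet solution is singled out only by the additional requirement that $A$ be periodic.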

As an important example we consider the Mathieu differential equation. As in Sect. 2.3.10, $\ddot x + f(t)\,x = 0$ with $f(t) = \frac14\Omega^2(a - 2q\cos\Omega t)$. Figure 2.34 shows the amplitude and phase of the Floquet solutions, and Fig. 2.27 its real part. Since the amplitude is periodic, it can be expanded in a Fourier series. We consider now

$$A^2(t) = \sum_{n=0}^{\infty} b_n\cos(n\Omega t)\quad\Longrightarrow\quad\varphi(t) = \int_0^t\frac{w\,dt'}{\sum_n b_n\cos(n\Omega t')},$$

since its Fourier coefficients converge quickly to 0 as $q^n/(n!)^2$ and can be determined from a recursion relation. (This is shown in [5].) The Wronski determinant $w$ becomes imaginary at the stability limits. Note that, in the unstable region, the same recursion relation holds for an expansion $A^2 = \sum_n b_n\cosh(n\Omega t)$. The phase $\varphi$ follows numerically from the above-mentioned integral expression using the Simpson method.
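
The Simpson evaluation of the phase integral $\varphi(t) = \int_0^t w\,dt'/A^2(t')$ can be sketched as follows. The Fourier coefficients used here are hypothetical (they are not the Mathieu coefficients from the recursion relation); a two-term series is chosen because its integral over one period is known in closed form, $wT/\sqrt{b_0^2 - b_1^2}$.

```python
import math

def amplitude_squared(t, b, omega):
    """A^2(t) = sum_n b_n cos(n omega t) for a list of coefficients b."""
    return sum(bn * math.cos(n * omega * t) for n, bn in enumerate(b))

def phase(t_end, w, b, omega, intervals=2000):
    """phi(t_end) = integral of w / A^2 from 0 to t_end by the composite Simpson rule.

    The number of intervals must be even.
    """
    h = t_end / intervals
    def g(t):
        return w / amplitude_squared(t, b, omega)
    total = g(0.0) + g(t_end)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0

# Hypothetical data: A^2(0) = b0 + b1 = 1, period T = 1
b, w, T = [0.8, 0.2], 1.3, 1.0
phi_T = phase(T, w, b, 2.0 * math.pi / T)
phi_exact = w * T / math.sqrt(b[0]**2 - b[1]**2)
```

With the true coefficients $b_n$ the same routine would deliver the phase increase per period directly as $\varphi(T)$.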




If we now take the generating function $G(t, p, x') = -A\,p\,x' + \frac12 mA\dot A\,x'^2$ (thus with $x = -\partial G/\partial p = Ax'$ and $p' = -\partial G/\partial x' = Ap - mA\dot A\,x'$), then from $H = \frac{1}{2m}p^2 + \frac m2 f x^2$, we have

$$H' = H + \frac{\partial G}{\partial t} = \frac{p'^2}{2mA^2} + \frac m2\,(\ddot A + fA)\,A\,x'^2,$$

since $\partial G/\partial t = -\dot A\,p\,x' + \frac12 m(\dot A^2 + A\ddot A)\,x'^2$. For $t = 0$, we should have $x' = x$ and $p' = p$, thus $A(0) = 1$ and $\dot A(0) = 0$. Because $\ddot A + fA = w^2/A^3$, we arrive at

$$H' = \frac{w}{A^2}\,I\quad\text{with}\quad I = \frac{p'^2}{2mw} + \frac{mw}{2}\,x'^2,$$

and because $\dot I = [I, H'] = [I, I]\,w/A^2 = 0$, $I$ does not depend upon $t$ and is thus an invariant. Since $w/A^2 = d\varphi/dt$, it is clearly appropriate here to use the phase instead of the time. For each observable $B$ not explicitly depending on time, we then have

$$\frac{dB}{dt} = [B, I]\,\frac{d\varphi}{dt}\quad\Longrightarrow\quad\frac{dB}{d\varphi} = [B, I].$$

In order to determine the function $B(\varphi)$, we therefore only need to know the invariant $I$. In particular, the position and momentum can then be determined. (Neither $\varphi$ nor $I$ nor $H'$ depend on the choice of scale for $w$: for $A \to cA$, we have in particular $w \to c^2w$, $x' \to c^{-1}x'$, and $p' \to c\,p'$.)
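
That $I$ is conserved can be checked numerically. The sketch below assumes the invariant in the form $I = p'^2/(2mw) + \frac12 mw\,x'^2$ with $x' = x/A$ and $p' = Ap - m\dot A x$ (using $p = m\dot x$), integrates Hill's equation together with the amplitude equation, and records the spread of $I$ along the trajectory. Parameters, initial values, and the integrator are hypothetical choices for illustration.

```python
import math

OMEGA, A_PAR, Q_PAR = 1.0, 2.0, 0.5   # hypothetical Mathieu parameters
M, W = 1.0, 0.5                        # hypothetical mass m and constant w

def f(t):
    return 0.25 * OMEGA**2 * (A_PAR - 2.0 * Q_PAR * math.cos(OMEGA * t))

def rhs(t, y):
    """y = [x, xdot, A, Adot]: Hill's equation for x, amplitude equation for A."""
    x, xdot, amp, ampdot = y
    return [xdot, -f(t) * x, ampdot, -f(t) * amp + W**2 / amp**3]

def rk4_step(t, y, h):
    k1 = rhs(t, y)
    k2 = rhs(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
    k3 = rhs(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
    k4 = rhs(t + h, [yi + h * k for yi, k in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def invariant(y):
    x, xdot, amp, ampdot = y
    x_new = x / amp                          # x' = x / A
    p_new = amp * M * xdot - M * ampdot * x  # p' = A p - m Adot x, with p = m xdot
    return p_new**2 / (2.0 * M * W) + 0.5 * M * W * x_new**2

def invariant_spread(t_end=20.0, h=0.002):
    """Difference between the largest and smallest value of I on the trajectory."""
    y, t = [0.7, -0.3, 1.0, 0.0], 0.0
    values = [invariant(y)]
    for _ in range(int(round(t_end / h))):
        y = rk4_step(t, y, h)
        t += h
        values.append(invariant(y))
    return max(values) - min(values)
```

The energy $H$ itself varies along the same trajectory, since $f$ depends on time; only the combination $I$ remains constant.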

The invariant $I$ does indeed help for the computation of the time dependence (of, e.g., position and momentum), because $H' = Iw/A^2$ is a Hamilton function, but $H'$ is not an energy. For this, the gauge is chosen such that the Hamilton function is composed of a potential and a kinetic energy according to p. 124. This works with

$$E = \frac{(p - mFx)^2}{2m} + \frac m2\,\bar f\,x^2,\quad\text{if}\quad\dot F = f - \bar f\quad(\text{and}\ \bar F = 0).$$

Once again, the bar indicates the time average ($\bar F$ need not be zero, but this choice makes $\overline{F^2}$ as small as possible, which has advantages), and thus $\frac m2\bar f x^2$ is a potential energy. The given expression for $E$ via the generating function

$$G(t, p, x') = A\,\{\tfrac12 m\,(\dot A + AF)\,x'^2 - p\,x'\}$$

leads to the above-mentioned form $H' = Iw/A^2$, thus also allowed as a Hamilton function. Because $\dot x = \partial E/\partial p = (p - mFx)/m$, the part $(p - mFx)^2/(2m)$ can be viewed as a kinetic energy $\frac m2\dot x^2$. Since $\dot p = -\partial E/\partial x = (p - mFx)\,F - m\bar f x = m\,(\dot xF - \bar fx)$, it turns out that $\ddot x = -fx$.




2.4.12 Summary: Hamiltonian Mechanics 


When searching for the time dependence, we tend to rely on conserved quantities. 
Therefore, momenta are often better to use than velocities. In the Hamiltonian for- 
malism, canonical transformations between position and momentum coordinates are 
permitted. Here, the difference between the two kinds of variables is blurred: we only 
talk about canonical variables in phase space. Because of the greater freedom in the 
choice of the phase space coordinates, even more suitable coordinates for a problem 
can be found than in Lagrangian mechanics. 

Moreover, formally, Hamiltonian mechanics is to be preferred because the Hamil- 
ton function H is the generating function of infinitesimal variations in time. The 
Liouville equation can be derived from this (important for statistical mechanics), 
and the Poisson brackets are also useful in quantum mechanics. 

According to the Hamilton–Jacobi theory, the Hamilton equations

$$\dot x^k = \frac{\partial H}{\partial p_k},\qquad\dot p_k = -\frac{\partial H}{\partial x^k},$$

can be combined into a single partial differential equation which is useful also in light-ray optics, viz.,

$$\frac{\partial W}{\partial t} + H\Big(t, x, \frac{\partial W}{\partial x}\Big) = 0,$$

where $W$ is the action

$$W = \int L\,dt.$$

Conversely, $dW/dt$ delivers the Lagrange function and everything that follows likewise from derivatives.
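
A simple case in which the Hamilton–Jacobi equation can be verified directly is the free particle. The sketch below assumes $H = p^2/(2m)$ and the action $W(x, t) = m\,(x - x_0)^2/(2t)$ of a particle starting at $x_0$ at time zero (a standard example, not taken from this section), and evaluates the residual $\partial W/\partial t + (\partial W/\partial x)^2/(2m)$ by central differences.

```python
def action(x, t, m=1.0, x0=0.0):
    """Free-particle action W = m (x - x0)^2 / (2 t) for a start at x0 at time 0."""
    return 0.5 * m * (x - x0) ** 2 / t

def hamilton_jacobi_residual(x, t, m=1.0, eps=1e-5):
    """dW/dt + H(x, dW/dx) with H = p^2/(2m), derivatives by central differences."""
    dw_dt = (action(x, t + eps, m) - action(x, t - eps, m)) / (2.0 * eps)
    dw_dx = (action(x + eps, t, m) - action(x - eps, t, m)) / (2.0 * eps)
    return dw_dt + dw_dx**2 / (2.0 * m)

residual = hamilton_jacobi_residual(1.5, 2.0)
```

The residual vanishes up to the finite-difference error, and $\partial W/\partial x = m\,(x - x_0)/t$ reproduces the constant momentum of the particle.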

The goal, namely to treat problems with many degrees of freedom with a single equation, is therefore achieved by Hamilton’s principle

$$\delta W = 0,\quad\text{at}\ \delta t = 0.$$

Since $\delta W = \int_{t_0}^{t_1}(\delta T + \delta A)\,dt$, it may even be applied to cases for which no potential energy exists, and hence there is neither a Lagrange function nor a Hamilton function.




Problems 


Problem 2.1 Determine the 3 × 3 matrix of the rotation operator $D$ for a body as a function of the Euler angles $\alpha$, $\beta$, $\gamma$, which are introduced in Fig. 1.10 on p. 30. (2 P)


Problem 2.2 Verify the result for the following 7 special cases: no rotation, 180° 
rotation about the x-, y-, z-axis, and 90° rotation about the x-, y-, z-axis. Which 
Euler angles belong to these 7 cases? 

Hint: Here, occasionally only $\alpha + \gamma$ or $\alpha - \gamma$ are determined. (7 P)


Problem 2.3 Which Euler angles $\{\alpha, \beta, \gamma\}$ belong to the inverse rotations? (Note that $0 \le \alpha \le 2\pi$, $0 \le \beta \le \pi$, and $0 \le \gamma \le 2\pi$.) (3 P)


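Problem 2.4 The rotation operator and Euler angles are needed to describe a top. For this application, the original coordinate system is the laboratory system, the new system is the body-fixed system. Let the unit vectors be $\mathbf l_x, \mathbf l_y, \mathbf l_z$ or $\mathbf k_x, \mathbf k_y, \mathbf k_z$. Let $\boldsymbol\omega_\gamma = \dot\gamma\,\mathbf k_z$ and determine the corresponding decomposition into $\boldsymbol\omega_\beta = \dot\beta\,\mathbf e_{\text{line of nodes}}$ and $\boldsymbol\omega_\alpha = \dot\alpha\,\mathbf l_z$ in the body-fixed and lab-fixed systems.

Let a rotation be $D = D_\alpha D_\beta$. How do the vector $A$ and the matrix $M$ given by

$$A = \begin{pmatrix} a_x \\ a_y \\ a_z \end{pmatrix}\quad\text{and}\quad M = \begin{pmatrix} 0 & a_z & -a_y \\ -a_z & 0 & a_x \\ a_y & -a_x & 0 \end{pmatrix}$$

transform under the rotation $D$? Note that the fact that $M' = DMD^{-1} = DM\widetilde D$ shows the same behavior under a rotation as $A' = DA$ is connected to the notion of axial vector. (8 P)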


Problem 2.5 Is the tensor force F = aki T with (see p. 56) 


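$$\mathbf T(\mathbf r) = \frac{(\mathbf m\cdot\mathbf r)\,\mathbf m' + (\mathbf m'\cdot\mathbf r)\,\mathbf m + (\mathbf m\cdot\mathbf m')\,\mathbf r}{r^5} - 5\,\frac{(\mathbf m\cdot\mathbf r)(\mathbf m'\cdot\mathbf r)\,\mathbf r}{r^7}$$

curl-free?
Hint: To investigate the singularity for $r = 0$, we may encircle the origin and apply Stokes’s theorem $\mathbf n\cdot(\nabla\times\mathbf F) = \lim_{A\to0}\frac1A\oint_{(A)} d\mathbf r\cdot\mathbf F$. (8 P)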


Problem 2.6 Determine the potential energy $V$ for Problem 2.5, and check that $\mathbf F = -\nabla V$. (4 P)


Problem 2.7 A circular disc of radius $R$ rolls, without sliding, on the $x,y$-plane. In addition to the two coordinates $(x, y)$ of the point of contact, the three Euler angles $\alpha$, $\beta$, $\gamma$ arise, because the normal to the circular disc has spherical coordinates $(\beta, \alpha)$, and $\gamma$ describes the rotation of the disc. The problem requires five coordinates with finite ranges, having five degrees of freedom (see Fig. 2.35). However, the static friction also delivers two differential conditions between the coordinates on the infinitely small scale:

• How do the constraints for the virtual displacements read?
• How many degrees of freedom does the disc have on the infinitely small scale?
• Why do the equations $\Phi(\alpha, \beta, \gamma, x, y) = 0$ here lead to inner contradictions? (Why are the constraints non-holonomic?)

(8 P)

Fig. 2.35 Rolling circular disc with normal $\mathbf n$. The Euler angles $\alpha$ and $\beta$ are shown, but not the rolling angle $\gamma$. The orbit arises from the rotation about the normal in the positive sense

Fig. 2.36 Crank motion. The pinion runs on a circle of radius $R$ and moves a connecting rod of length $L$


Problem 2.8 How do the Lagrange equations of the first kind read in statics if the constraints are given only in differential form (as in the last problem), namely through $\sum_m\Phi_{nm}\,\delta x^m = 0$? Here $n$ counts the $3N - f$ constraints.

Use this for Problem 2.7 to determine the Lagrangian parameters, and interpret the connection found between the generalized forces. Show in particular that, in the contact plane, a tangential force acts on the disc, that $F_\gamma$ cancels its torque, and that both $F_\alpha$ and $F_\beta$ are equal to zero. (8 P)


Problem 2.9 How strong does the force $F_2(F_1, \varphi)$ at the crank in Fig. 2.36 have to be for equilibrium? Determine this using the principle of virtual work. (4 P)


Problem 2.10 What does one obtain for this crank from the Lagrange equa- 
tions (Cartesian coordinates with origin at the center of rotation)? Do the results 
agree? (8 P) 


Problem 2.11 How much does the eccentricity $\varepsilon$ differ from 1 for a given axis ratio $b : a \le 1$ of an ellipse? Relate the difference between the distances at aphelion and perihelion for this ellipse to the mean value of these distances, and compare this with the axis ratio $b/a$. What follows then for small $\varepsilon$ if we account only for linear, but no squared terms in $\varepsilon$? (For the orbits of planets around the Sun, $\varepsilon < 0.1$.) For Comet Halley, $\varepsilon = 0.967\,276\,0$. How are the two axes related to each other, and what is the ratio of the lowest to the highest velocity? (2 P)


Problem 2.12 Let the polar angle $\varphi = 0$ be associated with the aphelion of the orbit of the Earth (astronomers associate $\varphi = 0$ with the perihelion) and the polar angle $\varphi_{\rm Sp}$ with the beginning of spring. Then $\varphi$ increases by $\pi/2$ at the beginning of the summer, autumn, and winter, respectively. In Hamburg, Germany, the lengths of the seasons are $T_{\rm Sp}$ = 92 d 20.5 h, $T_{\rm Su}$ = 93 d 14.5 h, $T_{\rm Au}$ = 89 d 18.5 h, $T_{\rm Wi}$ = 89 d 0.5 h. Determine $\varphi_{\rm Sp}$ and $\varepsilon$, neglecting squared terms in $\varepsilon$ compared to the linear ones. (6 P)


Problem 2.13 By how much is the sidereal day shorter than the solar day (the time between two highest altitudes of the Sun)? By how much does the length of the solar day change in a year? (The result should be determined at least to a linear approximation as a function of $\varepsilon$. Then for $\varepsilon = 1/60$, the difference between the longest and shortest solar day follows absolutely.) (4 P)


Problem 2.14 Why are the following three theorems valid for the acceleration $\ddot{\mathbf r} = -k\,\mathbf r$ (and constant $k > 0$)?

• The orbit is an ellipse with the center $\mathbf r = 0$.
• The ray $\mathbf r$ moves over equal areas in equal time spans.
• The period $T$ does not depend on the form of the orbital ellipse, but only on $k$.

Hint: Show that at certain times $\mathbf r$ and $\dot{\mathbf r}$ are perpendicular to each other. With such a time as the zero time, the problem simplifies enormously. (8 P)


Problem 2.15 What is the kinetic energy $T_2'$ of a mass $m_2$ in the laboratory system after the collision with another mass $m_1$, initially at rest, taken relative to its kinetic energy $T_2$ before the collision as a function of the scattering angle $\theta_S$ (in the center-of-mass system) and of the heat tone $Q$ or the parameter $\xi = \sqrt{1 + (m_1+m_2)/m_1\cdot Q/T_2}$? How does this ratio read for equal masses and elastic scattering as a function of the scattering angle in the laboratory system? (4 P)


Problem 2.16 What is the angle between the directions of motion of two particles 
in the laboratory system after the collision? Consider the special case of elastic 
scattering and in particular of equal masses. (4 P) 


Problem 2.17 Two smooth spheres with radii $R_1$ and $R_2$ collide with each other with the collision parameter $s$ (see Fig. 2.37). How large is the scattering angle $\theta_S$? (2 P)

Problem 2.18 How high is the mass $m_1$ of a body initially at rest, which has collided elastically with another body of mass $m_2$ and momentum $p_2$, if it is scattered by $\theta_2 = 90^\circ$ and keeps only the fraction $q$ of its kinetic energy in the laboratory system? (3 P)

Fig. 2.37 For the collision of two smooth spheres in the center-of-mass system, the component of $\mathbf p$ in the direction of the line connecting the sphere centers becomes reversed, the component perpendicular to it is conserved (see Problem 2.17)


Problem 2.19 A spherical rain drop falls in a homogeneous gravitational field without friction through a saturated cloud. Its mass increases in proportion to its surface area with time. Which inhomogeneous linear differential equation follows for the velocity $v$, if instead of the time, we take the radius as independent variable? What is the solution of this differential equation? (In practice, we consider only the momentum $\propto r^3v$ as an unknown function.) Compare with the free fall of a constant mass. (7 P)


Problem 2.20 How can we show using Legendre polynomials that the gravitational potential is constant within an inhomogeneous, but spherically symmetric hollow sphere, and therefore that it does not exert there a gravitational force on a test body? How does the potential read if a sphere with radius $r_1$ and homogeneous density $\rho_1$ is covered by a hollow sphere of homogeneous density $\rho_2$ and external radius $r_2$? (The Earth has a core, mainly of iron, and a mantle of SiO₂, MgO, FeO, and others, approximately 2900 km thick.) (6 P)


Problem 2.21 What height is reached by a ball thrown vertically upwards with velocity $v_0$? Consider the friction with the air (Newtonian friction) and determine the frictional work done, by integration as well as by comparing heights with and without friction. (8 P)


Problem 2.22 A horizontal plate oscillates harmonically up and down with ampli- 
tude A and oscillation period T. What inequality is obeyed by A and T if a loosely 
attached body on the plate does not lift off? (2 P) 


Problem 2.23 A car at a speed of 20 km/h runs into a wall and is then evenly 
decelerated, until it stops, at which point it has been deformed by 30 cm. What is the 
deceleration during the collision? Can a weightlifter who can lift twice the weight 
of his body protect himself from hitting the steering wheel? If two such cars with 
relative velocity 40 km/h hit each other head-on, are the same processes valid for the 
single drivers as above, or do double or fourfold forces arise? (4 P) 


Problem 2.24 Prove the following theorem: For each plane mass distribution, the moment of inertia with respect to the normal of the plane is equal to the sum of the moments of inertia with respect to two mutually perpendicular axes in the plane. (1 P)

Problem 2.25 Derive from that the main moments of inertia of a homogeneous cuboid with edge lengths $a$, $b$, and $c$. (1 P)


Problem 2.26 Determine the moment of inertia of the cuboid with respect to the 
edge c using three methods: 


• As in Problem 2.25.
• Using Problem 2.25 but dividing up a correspondingly larger cuboid.
• Using Steiner’s theorem.


(2 P) 


Problem 2.27 Decide whether the following claim is correct: The moment of inertia of a rod of mass $M$ and length $l$ perpendicular to the axis does not depend on the cross-section $A$, and with respect to an axis of rotation on the face is four times as large as with respect to an axis of rotation through the center of mass. (2 P)


Problem 2.28 Prove the following: Rotations about the axes of the highest and 
lowest moments of inertia are stable motions, while rotations about the axis of the 
middle moment of inertia are unstable. 

Hint: Use the Euler equations for the rigid body, and make an ansatz for the angular velocity $\boldsymbol\omega = \boldsymbol\omega_1 + \boldsymbol\delta$ with constant $\boldsymbol\omega_1$ along a principal axis of the moment of inertia under small perturbations $\boldsymbol\delta = \boldsymbol\delta_0\exp(\lambda t)$ perpendicular to it. This implies a constraint for $\lambda(I_1, I_2, I_3, \omega_1)$. (4 P)


Problem 2.29 How high is the Coriolis acceleration of a sphere shot horizontally with velocity $v_0$ at the north pole? Through which angle $\varphi$ is it deflected during the time $t$? Through which angle does the Earth rotate during the same time? (2 P)


Problem 2.30 A uniform heavy rope of length $l$ and mass $\mu l$ hangs on a pulley of radius $R$ and moment of inertia $I$, with the two rope ends initially at the same height. Then the pulley gets pushed with $\dot\theta(0) = \omega_0$. Neglect the friction of the pulley about its horizontal axis. As long as the rope presses on the pulley with the total force $F > F_0$, the static friction leads to the same (angular) velocity of rope and pulley; after that the rope slides down faster. How does the (angular) velocity depend on the time, up until the rope starts sliding? What is the difference in height of the ends of the rope at this time? (8 P)


Problem 2.31 Show that the homogeneous magnetic field $\mathbf B = B\,\mathbf e_z$ may be associated with the two vector potentials $\mathbf A_1 = \frac12(\mathbf B\times\mathbf r)$ and $\mathbf A_2 = Bx\,\mathbf e_y$ (gauge invariance). What scalar field $\psi$ leads to $\nabla\psi = \mathbf A_1 - \mathbf A_2$? What is the difference between the associated Lagrange functions $L_1$ and $L_2$? Why is it that this difference does not affect the motion of a particle of charge $q$ and mass $m$ in the magnetic field $\mathbf B$? (6 P)


Problem 2.32 Two point masses interact with $V(|\mathbf r_1 - \mathbf r_2|)$ and are not subject to any external forces. How do the Lagrange equations (of the second kind) read in the center-of-mass and relative coordinates? (4 P)


Fig. 2.38 Double pendulum made from rods of mass $m_1$ and $m_2$ and with moments of inertia $I_1$ and $I_2$ with respect to the hinges (•), which are separated by a distance $l$. The distances of the centers of mass of the rods from the hinges are $s_1$ and $s_2$, respectively

Problem 2.33 For the double pendulum in Fig. 2.38, determine $T$ as well as $V$ as a function of $\theta_1$, $\theta_2$, $\dot\theta_1$, and $\dot\theta_2$. How are these expressions simplified for small amplitudes? (6 P)


Problem 2.34 In the last problem, let $\theta_1 = \theta_2 = 0$ for $t < 0$. At time $t = 0$, the upper pendulum obtains an impulse, in fact with angular momentum $L$ with respect to its hinge. What initial values follow for $\dot\theta_1$ and $\dot\theta_2$, in particular for the mathematical pendulum? (4 P)


Problem 2.35 A homogeneous sphere of mass $M$ and radius $r$ rolls on the inclined plane shown in Fig. 2.39 (with $\mathbf g\cdot\mathbf e_z = 0$). Its moment of inertia is $I = \frac25 Mr^2$. Determine its Lagrange function and the equations of motion for the coordinates $(x, z)$ at the point of contact. (Here we use $z$ instead of $y$, in anticipation of the next problem.) (3 P)


Problem 2.36 Treat the corresponding problem if the plane is deformed into a cylindrical groove with radius $R$ and axis parallel to $\mathbf e_z$ (see Fig. 2.40). (Instead of $x$, it is better to adopt the cylindrical coordinate $\varphi$ with $\varphi = 0$ at the lowest position.) How large may $\dot\varphi(0)$ be at most, if we always have $|\varphi| \le \frac12\pi$ and $\varphi(0) = 0$? (4 P)

Fig. 2.39 Oblique plane with inclination angle $\alpha$ (the angle between the downwards oriented normal and the vertical), whence $\mathbf g\cdot\mathbf e_x = g\sin\alpha$

Fig. 2.40 Sphere in a groove. A sphere with radius $r$ rolls on a circle of radius $R$, then with the angles $\psi$ and $\varphi$ shown, the relation $r\,(\psi + \varphi) = R\,\varphi$ holds


Problem 2.37 Determine the resonance angular frequency $\omega_R$ of a forced damped oscillation and show that the frequency $\omega_0$ of the undamped oscillation is higher than $\omega_R$. What is the ratio of the oscillation amplitude for $\omega_R$ to that for $\omega_0$? What is the approximate result for $\gamma \ll \omega_0$? (4 P)


Problem 2.38 What differential equation and initial values are valid for the Green function $G(\tau)$ for the differential equation $\ddot x + 2\gamma\dot x + \omega_0^2x = f(t)$ of the forced damped oscillation with solution written as $x(t) = \int_{-\infty}^{\infty} f(t')\,G(t - t')\,dt'$? Which $G(\tau)$ is the most general solution, independent of $f(t)$? With this, the solution of the differential equation may be traced back to a simple integration. Check this for the example $f(t) = c\cos(\omega t)$ in the special case $\gamma = \omega_0$ ($> 0$). (9 P)


Problem 2.39 What equation of motion is supplied by the Lagrange formalism for the double pendulum investigated in Problem 2.33 in the angle coordinates $\theta_1$ and $\theta_2$, exactly on the one hand, and for restriction to small oscillations on the other (i.e., taking $\theta_1$ and $\theta_2$ and their derivatives to be small quantities)? (4 P)


Problem 2.40 Which normal frequencies $\omega_\pm$ result for this double pendulum? Determine the matrices $A$ and $B$. Investigate also the special case of the mathematical pendulum with $s_1 = l$, and use the abbreviation $\sigma = s_2/s_1$ and $\mu = m_2/m_1$, where the normal frequencies are given here at best as multiples of the eigenfrequency $\omega_1$ of the upper pendulum. (6 P)


Problem 2.41 Determine the normal frequencies and the matrices $A$, $B$, and $C$ for the mathematical double pendulum with $s_2 = s_1 = l$ and $m_2 \ll m_1$. (Here one should use the fact that $\mu \ll 1$ holds in $C$. Why does one have to calculate $\omega_\pm$ “more precisely by one order”?) (6 P)


Problem 2.42 What functions $\theta_1(t)$ and $\theta_2(t)$ belong to the just investigated mathematical double pendulum (with $\mu \ll 1$) for the following initial values: $\theta_1(0) = \theta_2(0) = 0$, $\dot\theta_1(0) = -\dot\theta_2(0) = \Omega$, which according to Problem 2.34 correspond to a collision against the upper pendulum for the double pendulum initially at rest? (Why do we only have to consider here the behavior of the normal coordinates?) Which angular frequencies do the beats have, and how does the amplitude of $\theta_1$ behave in comparison to that of $\theta_2$? (4 P)


Problem 2.43 Prove the Jacobi identity $[u, [v, w]] + [v, [w, u]] + [w, [u, v]] = 0$ for the Poisson brackets, with $[u, v] = u_xv_p - u_pv_x$ and $u_x = \partial u/\partial x$, etc. (3 P)


Problem 2.44 Determine the Poisson brackets of the angular momentum component $L_x$ with $x$, $y$, $z$, $p_x$, $p_y$, $p_z$, and $L_y$. Note that, by cyclic commutation, $[\mathbf L, \mathbf r]$, $[\mathbf L, \mathbf p]$, and $[L_i, L_k]$ are then also proven. (5 P)


Problem 2.45 Under which constraints is the transformation $x' = \arctan(\alpha x/p)$, $p' = \beta x^2 + \gamma p^2$ a canonical one? (3 P)


Problem 2.46 Is the transformation $x' = x^\alpha\cos\beta p$, $p' = x^\alpha\sin\beta p$ canonical? (2 P)


Problem 2.47 Using the generating function 


/ 1 1 / 
G(x, Px’, Py, Py) =x 5 tps + Py} — Py {Px — py }/(@B), 
show that the Hamilton function 


$$H = \frac{1}{2m}\Big\{\big(p_x + \tfrac12 qBy\big)^2 + \big(p_y - \tfrac12 qBx\big)^2\Big\}$$

for a charged point mass in the plane perpendicular to a homogeneous magnetic field $B\,\mathbf e_z$ can be written as the Hamilton function of a linear harmonic oscillation. (3 P)


Problem 2.48 From this derive the transformation on p. 128. Show also, without using the generating function, that this transformation is canonical. Why does it not suffice here to compare the four derivatives $\partial x'/\partial x$, $\partial x'/\partial y$, $\partial y'/\partial x$, and $\partial y'/\partial y$ with $\partial p_x/\partial p_x'$, $\partial p_x/\partial p_y'$, $\partial p_y/\partial p_x'$, and $\partial p_y/\partial p_y'$, as seems to suffice according to p. 126? (Whence an additional comment is missing here.) (4 P)


List of Symbols 


We stick closely to the recommendations of the International Union of Pure and 
Applied Physics (IUPAP) and the Deutsches Institut für Normung (DIN). These
are listed in Symbole, Einheiten und Nomenklatur in der Physik (Physik-Verlag, 
Weinheim 1980) and are marked here with an asterisk. However, one and the same 
symbol may represent different quantities in different branches of physics. Therefore, 
we have to divide the list of symbols into different parts (Table 2.3). 


Table 2.3 Symbols used in mechanics

     Symbol                                                Name                                        Page reference
  *  $\mathbf v, \dot{\mathbf r}$                          Velocity                                    2
  *  $\mathbf a, \dot{\mathbf v}, \ddot{\mathbf r}$        Acceleration                                2
  *  $\mathbf F$                                           Force                                       55
  *  $\mathbf M$                                           Torque                                      70
     $M$                                                   Total mass                                  71
  *  $m$                                                   Mass                                        69
  *  $\mu$                                                 Reduced mass                                72
  *  $A$                                                   Work                                        56
  *  $E$                                                   Energy                                      78
  *  $V$                                                   Potential energy                            56
  *  $T$                                                   Kinetic energy                              70
  *  $T$                                                   Oscillation period                          104
  *  $\rho$                                                Density (mass density)                      81
     $\rho$                                                Probability density                         125
  *  $\mathbf p$                                           Motional quantity, momentum                 69, 93, 99
  *  $\mathbf L$                                           Angular momentum                            70
  *  $G$                                                   Gravitational constant                      623, 79
     $G$                                                   Generating function                         130
  *  $g$                                                   Free-fall acceleration                      81
  *  $I$                                                   Moment of inertia                           87
  *  $\omega$                                              Angular frequency                           67
     $\boldsymbol\omega$                                   Angular velocity                            67
     $x^k$                                                 Generalized coordinate                      60
  *  $p_k$                                                 Momentum canonically conjugate to $x^k$     93, 99
     $F_k$                                                 Generalized force                           60
  *  $L$                                                   Lagrange function                           96
  *  $H$                                                   Hamilton function                           122
  *  $W$                                                   Action function                             135
  *  $S$                                                   Characteristic function                     136
     $[u, v]$                                              Poisson bracket                             124
References 


1. M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions (Dover, New York, 1970) 

2. P.F. Byrd, M.D. Friedman, Handbook of Elliptic Integrals for Engineers and Physicists (Springer, 
Berlin, 1954) 

3. J. Meixner, F.W. Schäfke, G. Wolf, Mathieu Functions and Spheroidal Functions and Their 
Mathematical Foundations (Springer, Berlin, 1980) 

4. D.H. Kobe, K.H. Yang, Eur. J. Phys. 8, 236 (1987)

5. A. Lindner, H. Freese, J. Phys. A 27, 5565 (1994) 




Suggestions for Textbooks and Further Reading 


6. W. Greiner, Classical Mechanics—System of Particles and Hamiltonian Dynamics (Springer, 
New York, 2010) 

7. L.D. Landau, E.M. Lifshitz, Course of Theoretical Physics. Volume 1—Mechanics, 3rd edn. 
(Butterworth-Heinemann, Oxford, 1976) 

8. W. Nolting, Theoretical Physics 1—Classical Mechanics (Springer, Berlin, 2016) 

9. W. Nolting, Theoretical Physics 2—Analytical Mechanics (Springer, Berlin, 2016) 

10. F. Scheck, Mechanics—From Newton’s Laws to Deterministic Chaos (Springer, Berlin, 2010) 

11. A. Sommerfeld, Lectures on Theoretical Physics 1—Mechanics (Academic, London, 1964) 

12. D. Strauch, Classical Mechanics (Springer, Berlin, 2009) 

13. W. Thirring, Classical Mathematical Physics: Dynamical Systems and Field Theories, 3rd edn. 
(Springer, New York, 2013) 

14. G. Ludwig, Einführung in die Grundlagen der Theoretischen Physik 1—4 (Vieweg, Braun- 
schweig, 1974) (in German) 

15. M. Mizushima, Theoretical Physics: From Classical Mechanics to Group Theory of Micropar- 
ticles (Wiley, New York, 1972) 


Chapter 3
Electromagnetism


3.1 Electrostatics 


3.1.1 Overview of Electromagnetism 


The basic equations of electromagnetism were found by Maxwell in 1862. They 
comprise not only electricity and magnetism, but also (wave) optics (as electromag- 
netic radiation)—and thus a very diverse range of phenomena. Actually, most of 
this was known before Maxwell, but he discovered the displacement current and 
thus also correctly connected the time-dependent electric and magnetic fields for 
non-conductors. Since then the concept of fields has been accepted. 

We start from Coulomb’s law giving the force between two charges, and from 
this derive the electric field. Then we consider its action on polarizable media and 
discriminate between microscopic and macroscopically averaged quantities. The 
essential basic concepts are electric charge and polarization. 

We then consider moving charges and the Lorentz force. This will lead us to the 
concept of the magnetic field (the Biot–Savart law). Ampère’s molecular currents in
microscopic conductor loops produce magnetic moments, but otherwise cannot be 
verified (as currents). The magnetic moments of elementary particles with spin 1/2 
(e.g., electrons) cannot even be attributed to currents in such microscopic conductor 
loops: like charges we have to accept them as non-derivable properties of these 
particles. Thus the coupling between two magnetic moments is likewise discarded as 
“basic”, in contrast to the force between electric charges and Coulomb’s law as the 
sole basis of electromagnetism—even if the scalar interaction between charges can 
be described in a simpler way than the tensor coupling between dipole moments. 

The conservation law of charges and Faraday’s induction law then result from 
Maxwell’s equations: 


© Springer Nature Switzerland AG 2018
A. Lindner and D. Strauch, A Complete Course on Theoretical Physics, Undergraduate Lecture Notes in Physics,
https://doi.org/10.1007/978-3-030-04360-5_3


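$$\nabla\times\mathbf E = -\frac{\partial\mathbf B}{\partial t},\qquad\nabla\cdot\mathbf B = 0,$$

$$\nabla\cdot\mathbf D = \rho,\qquad\nabla\times\mathbf H = \mathbf j + \frac{\partial\mathbf D}{\partial t}.$$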

The various quantities have the following names:

  $\mathbf E$  electric field strength,       $\mathbf B$  magnetic displacement field (induction),
  $\mathbf D$  electric displacement field,   $\mathbf H$  magnetic field strength,
  $\rho$  charge density,                     $\mathbf j$  current density.


The term $\partial\mathbf D/\partial t$ is the density of the above-mentioned displacement current.

Maxwell’s equations connect on the one hand E with B and on the other hand D 
with H. Therefore, E and B are also sometimes called field strengths, while D and 
H are referred to as excitations. The last two equations in particular contain further 
fields, viz., the charge and current densities. However, the two quantities E and B 
supply the force on a test charge. Here we have to know how D and E as well as H 
and B are connected—only then are the source and curl densities of the fields given, 
whereupon the basic theorem of vector analysis on p. 25 becomes applicable. 

The wave equations for the fields result from Maxwell’s equations with $\mathbf D \propto \mathbf E$ and $\mathbf H \propto \mathbf B$. Then waves can propagate in empty space with the velocity of light

$$c_0 = 299\,792\,458\ \text{m/s}.$$


This is the same in all inertial frames, which leads to Lorentz invariance, something 
we shall discuss after dealing with Maxwell’s equations. Then the four equations 
for the three-vectors appearing above will be derived from two equations for four- 
vectors. 

After that we shall consider the electromagnetic radiation field, which is produced 
by an accelerated charge, similar to the electric field of a charge at rest and the 
magnetic field of a uniformly moving charge. 

Here we shall comply with the international system of units (SI). In addition to 
length, time, and mass with the units m, s, and kg, a basic electromagnetic quantity 
is introduced, namely the current strength with the unit A (ampere). Then further 
units are related to these, e.g., 


volt V = W/A, ohm Ω = V/A = S⁻¹ (siemens),
coulomb C = A s, farad F = C/V = S s,
weber Wb = V s, henry H = Wb/A = Ω s,
tesla T = Wb/m².

In the international system of units, a magnetic field constant is necessary, viz.,

$$
\mu_0 = 4\pi \times 10^{-7}\ \mathrm{H/m} = 4\pi \times 10^{-7}\ \mathrm{N/A^2},
$$




and an electric field constant, viz.,

$$
\varepsilon_0 = \frac{1}{c_0^2\,\mu_0} = 8.854187817622\ldots \times 10^{-12}\ \mathrm{F/m}.
$$

Here, $\mu_0/4\pi$ appears in many equations for point charges and dipole moments, as
does $1/4\pi\varepsilon_0 = c_0^2\mu_0/4\pi$, and $c_0\mu_0 = (c_0\varepsilon_0)^{-1} = 376.7303134618\ldots\ \Omega$ is the
so-called wave resistance in empty space, mentioned on p. 222.
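
The relations between the three constants can be checked numerically. The following is a minimal sketch in Python; the variable names are chosen here for illustration, and it assumes the convention used in the text, in which $\mu_0$ has the exact value $4\pi \times 10^{-7}$ H/m.

```python
import math

# Constants as quoted in the text (mu_0 exact by definition in this convention)
C0 = 299_792_458.0            # velocity of light in empty space, m/s
MU0 = 4.0 * math.pi * 1.0e-7  # magnetic field constant, H/m

EPS0 = 1.0 / (C0**2 * MU0)    # electric field constant, F/m
Z0 = C0 * MU0                 # wave resistance of empty space, ohm
COULOMB_CONST = C0**2 * MU0 / (4.0 * math.pi)  # 1/(4 pi eps0), N m^2/C^2

print(f"eps0        = {EPS0:.10e} F/m")
print(f"Z0          = {Z0:.8f} ohm")
print(f"1/(4pi eps0) = {COULOMB_CONST:.10e} N m^2/C^2")
```

Only $c_0$ and $\mu_0$ are put in by hand; the other values follow from them.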

However, in theoretical and atomic physics, the Gauss system of units is also often 
used. There, the electromagnetic quantities are introduced differently (despite the 
warning above: Coulomb’s law is taken as the starting point from which Maxwell’s 
equations have to be derived, while the international system starts from Maxwell’s 
equations and deduces Coulomb’s law), but irritatingly the same names and letters 
are used. If we denote the quantities in the Gauss system with an asterisk, we have 


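$$
\mathbf{E}^* = \sqrt{4\pi\varepsilon_0}\,\mathbf{E}, \qquad \mathbf{B}^* = \sqrt{4\pi/\mu_0}\,\mathbf{B},
$$

$$
\mathbf{D}^* = \sqrt{4\pi/\varepsilon_0}\,\mathbf{D}, \qquad \mathbf{H}^* = \sqrt{4\pi\mu_0}\,\mathbf{H},
$$

$$
\rho^* = \rho/\sqrt{4\pi\varepsilon_0}, \qquad \mathbf{j}^* = \mathbf{j}/\sqrt{4\pi\varepsilon_0}.
$$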


Then Maxwell’s equations appear in the form 


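$$
\nabla \times \mathbf{E}^* = -\frac{1}{c_0}\,\frac{\partial \mathbf{B}^*}{\partial t}, \qquad \nabla \cdot \mathbf{B}^* = 0,
$$

$$
\nabla \cdot \mathbf{D}^* = 4\pi \rho^*, \qquad \nabla \times \mathbf{H}^* = \frac{4\pi}{c_0}\,\mathbf{j}^* + \frac{1}{c_0}\,\frac{\partial \mathbf{D}^*}{\partial t}.
$$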


Here, further factors occur in Maxwell’s equations. Particularly bothersome are the
factors $4\pi$. They occur in the Gauss system in plane problems and are missing in
spherically symmetric ones. The difference between the two systems is dismissed as
a problem of units, even though the equations deal with quantities that do not depend
at all upon the chosen units (see Sect. 1.1.1). However, different notions generally
have different units. Thus, in the Gauss system for $\mathbf{B}^*$, the gauss (G) is used and for
$\mathbf{H}^*$, the oersted (Oe). They are both equal to $\sqrt{\mathrm{g/(cm\,s^2)}}$, whence $\mathbf{B}^*$ and $\mathbf{H}^*$ are also
easily confused. For the transition between the two unit systems, we have 10 kG =
1 T and $4\pi$ mOe = 1 A/m.
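
As a small illustration of these two conversion factors, here is a sketch in Python. The function names are hypothetical and only encode the relations 1 T = 10 kG and 1 A/m = $4\pi$ mOe stated above.

```python
import math


def tesla_to_gauss(b_tesla):
    """Convert a magnetic induction from tesla (SI) to gauss (Gauss system)."""
    return b_tesla * 1.0e4  # 1 T = 10 kG


def ampere_per_metre_to_oersted(h_si):
    """Convert a magnetic field strength from A/m (SI) to oersted (Gauss system)."""
    return h_si * 4.0 * math.pi * 1.0e-3  # 1 A/m = 4 pi mOe


print(tesla_to_gauss(1.0), "G")
print(ampere_per_metre_to_oersted(1.0), "Oe")
```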

Particularly elaborate are the textbooks by Jackson and by Panofsky and Phillips 
(see the recommended textbooks on p. 274). The first employs the Gauss system in 
earlier editions, but since then both have used the international system. 


3.1.2 Coulomb’s Law—Far or Near Action? 


In classical mechanics, mass is associated with all bodies. Some of them also carry 
electric charge Q, as becomes apparent from new forces. For point charges we
usually write q. An electron, for example, has the charge

$$
q_e = -e = -1.602176462(63) \times 10^{-19}\ \mathrm{C},
$$


and the proton the opposite charge. There are charges of both signs (in contrast to the 
mass, which is always positive) and the excess of positive or negative charge results 
in the charge Q of the body. We thus introduce the charge density $\rho(\mathbf{r})$, whereupon
$Q = \int \mathrm{d}V\, \rho(\mathbf{r})$.

According to Coulomb (1785), there is a force acting between two point charges
$q$ and $q'$ (at rest) at the positions $\mathbf{r}$ and $\mathbf{r}'$ in empty space, which depends upon the
distance as $|\mathbf{r} - \mathbf{r}'|^{-2}$ and which is proportional to the product $qq'$ of the charges.
Here the force is repulsive or attractive, depending on whether the charges have equal
or opposite sign:

$$
\mathbf{F} = \frac{1}{4\pi\varepsilon_0}\,\frac{q\,q'}{|\mathbf{r} - \mathbf{r}'|^2}\,\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|}.
$$


This is the force on the charge $q$. The one on $q'$ (at $\mathbf{r}'$) is oriented oppositely, as
required by Newton’s third law (action = reaction, see p. 55). The factor $(4\pi\varepsilon_0)^{-1}$
is connected with the concept of charge in the international system; it is missing
in the Gauss system. Here $\varepsilon_0$ is the electric field constant, and according to the last
section,

$$
\frac{1}{4\pi\varepsilon_0} = \frac{c_0^2}{10^{7}}\,\frac{\mathrm{H}}{\mathrm{m}} = 8.987551787368\ldots \times 10^{9}\ \frac{\mathrm{N\,m^2}}{\mathrm{C^2}}.
$$


Hence for electron and proton pairs, we have

$$
\frac{e^2}{4\pi\varepsilon_0} = 2.30707706(19) \times 10^{-28}\ \mathrm{J\,m} = 1.439964392(57)\ \mathrm{eV\,nm},
$$

where the last expression is suitable for atomic scales and, because eV nm = MeV fm,
also for nuclear physics.
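
The two numerical values can be reproduced from the constants quoted so far. This is a minimal sketch in Python; the variable names are illustrative, and the elementary charge is simply the value printed above, without its uncertainty.

```python
# Values quoted in the text
C0 = 299_792_458.0              # velocity of light, m/s
E_CHARGE = 1.602176462e-19      # elementary charge, C
COULOMB_CONST = C0**2 * 1.0e-7  # 1/(4 pi eps0) in N m^2/C^2

e2_joule_metre = E_CHARGE**2 * COULOMB_CONST
# Dividing by e converts J to eV, dividing by 1e-9 converts m to nm
e2_ev_nm = e2_joule_metre / E_CHARGE / 1.0e-9

print(f"e^2/(4 pi eps0) = {e2_joule_metre:.8e} J m")
print(f"e^2/(4 pi eps0) = {e2_ev_nm:.8f} eV nm")
```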

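Coulomb’s law describes an action at a distance. But we may also introduce a
field $\mathbf{E}(\mathbf{r})$ which surrounds the charge $q'$ and acts on the test charge $q$ (at $\mathbf{r}$):

$$
\mathbf{F} = q\,\mathbf{E}(\mathbf{r}), \quad\text{with}\quad \mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\,\frac{q'}{|\mathbf{r} - \mathbf{r}'|^2}\,\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|}.
$$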


This electric field strength E is conveniently given in N/C = V/m. 
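
The step from the force law to the field picture can be sketched in a few lines of Python. The function names and the sample charges are assumptions made for this illustration; positions are plain 3-tuples in metres and charges are in coulomb.

```python
import math

EPS0 = 8.854187817e-12  # electric field constant, F/m


def coulomb_field(q_src, r_src, r):
    """Field E(r) of a point charge q_src located at r_src."""
    d = [a - b for a, b in zip(r, r_src)]          # r - r'
    dist = math.sqrt(sum(c * c for c in d))
    pref = q_src / (4.0 * math.pi * EPS0 * dist**3)
    return tuple(pref * c for c in d)


def coulomb_force(q, r, q_src, r_src):
    """Force F = q E(r) on the test charge q at r."""
    return tuple(q * e for e in coulomb_field(q_src, r_src, r))


# Two like charges 1 m apart on the z axis
force_on_q = coulomb_force(1.0e-6, (0.0, 0.0, 0.0), 2.0e-6, (0.0, 0.0, 1.0))
force_on_q_src = coulomb_force(2.0e-6, (0.0, 0.0, 1.0), 1.0e-6, (0.0, 0.0, 0.0))
print(force_on_q, force_on_q_src)
```

The two forces come out equal and opposite, and for like charges each points away from the other charge.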

The concept of a field will be proven to be correct in the context of time-dependent 
phenomena, because actions propagate only with finite velocity, which contradicts 
the law of action at a distance. Therefore, we shall already use the field concept in 
electrostatics. 

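A point-like charge $q'$ is thus associated with the electric field

$$
\mathbf{E}(\mathbf{r}) = \frac{q'}{4\pi\varepsilon_0}\,\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3} = -\frac{q'}{4\pi\varepsilon_0}\,\nabla\frac{1}{|\mathbf{r} - \mathbf{r}'|},
$$

the source of which is the charge $q'$ at the position $\mathbf{r}'$, according to p. 25, and which
is irrotational (curl-free):

$$
\nabla \cdot \mathbf{E} = \frac{q'}{\varepsilon_0}\,\delta(\mathbf{r} - \mathbf{r}') \quad\text{and}\quad \nabla \times \mathbf{E} = 0.
$$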


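From the point-like charge, we extend the notion to an extended charge with charge
density $\rho'$. So far we have been dealing with the special case of $\rho' = q'\,\delta(\mathbf{r} - \mathbf{r}')$
and now generalize this to

$$
\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0} \quad\text{and}\quad \nabla \times \mathbf{E} = 0.
$$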


(Here, and in the next few equations, we should have $\rho'$ instead of $\rho$ and $Q'$ instead
of $Q$, but temporarily there will only be field-creating charges and no test charges,
so we prefer to simplify the notation.) However, this is allowed only if the fields 
of the various point charges superimpose linearly—and if these charges remain at 
their positions when we move the test charge around as a field sensor. (Because of 
induction, this is not justified for conductors, as will become apparent on p. 181.) 
For charges distributed over a sheet, the normal component of the field strength 
thus has a discontinuity (see p. 28), while the tangential component is continuous: 


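$$
\mathbf{n} \cdot (\mathbf{E}_+ - \mathbf{E}_-) = \frac{\rho_A}{\varepsilon_0} \quad\text{and}\quad \mathbf{n} \times (\mathbf{E}_+ - \mathbf{E}_-) = 0.
$$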


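The two basic differential equations for the electrostatic field can be converted into
integral equations using the theorems of Gauss and Stokes. Instead of the charge
density $\rho$, only the charge $Q = \int \mathrm{d}V\, \rho(\mathbf{r})$ enclosed in V is important:

$$
\oint_{(V)} \mathrm{d}\mathbf{f} \cdot \mathbf{E} = \frac{Q}{\varepsilon_0} \quad\text{and}\quad \oint_{(A)} \mathrm{d}\mathbf{r} \cdot \mathbf{E} = 0.
$$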


According to the last equation, we also have $\oint \mathrm{d}\mathbf{r} \cdot \mathbf{F} = q \oint \mathrm{d}\mathbf{r} \cdot \mathbf{E} = 0$: no work is
needed to move a test charge on a closed path in an electrostatic field, since the 
field is irrotational. The charge-free space is also source-free. Therefore, the field 
lines, with tangents in the direction of the field, can be taken as the lines forming 
the walls of the flux tubes (see Fig. 3.1 and also p. 12). Figures 3.2 and 3.3 present 
examples. For two source points, we take a series of cones around the symmetry axis 
with increasing units of flux and then connect appropriate intersections (Maxwell’s 
construction). 
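
The integral form of Gauss's law can be tested numerically for a point charge that is not at the centre of the enclosing surface. The following Python sketch uses a simple midpoint rule on a sphere around the origin; the function name, the grid sizes and the sample positions are assumptions made for this illustration.

```python
import math

EPS0 = 8.854187817e-12  # electric field constant, F/m


def flux_through_sphere(q, r_charge, radius, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the flux of E through a sphere centred at the origin."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal; the surface point is radius * n
            n = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            d = [radius * n_k - c_k for n_k, c_k in zip(n, r_charge)]
            dist = math.sqrt(sum(c * c for c in d))
            e_dot_n = q / (4.0 * math.pi * EPS0 * dist**3) * sum(
                d_k * n_k for d_k, n_k in zip(d, n))
            total += e_dot_n * radius**2 * math.sin(theta) * d_theta * d_phi
    return total


q = 1.0e-9
inside = flux_through_sphere(q, (0.3, 0.0, 0.2), 1.0)   # charge enclosed
outside = flux_through_sphere(q, (2.0, 0.0, 0.0), 1.0)  # charge not enclosed
print(inside * EPS0 / q, outside * EPS0 / q)
```

In units of $q/\varepsilon_0$, the flux should be close to one when the charge is enclosed and close to zero when it is not, wherever the charge sits inside or outside the sphere.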




Fig. 3.1 Construction of field lines around point charges q. The same displacement field passes
through the surface of spheres with radii $r \propto \sqrt{|q|}$ around q (here $q = q'$ is assumed, and thus equal
spheres). Disks of equal thickness are shown with dashed lines and hence with walls of equal area
$\mathrm{d}A = 2\pi R \sin\alpha\, R\,|\mathrm{d}\alpha| = 2\pi R\,|\mathrm{d}z|$, and also equal flux. In the next two pictures, the intersections
of the straight lines with equal parameter sum or difference are to be connected, because what flows
into a quadrangle (solenoidal), e.g., from below, must also emerge again (in the
example, diffracted at the wall of the field-line tubes). See also Problem 3.13


Fig. 3.2 Field lines of two like charges—the ratio of the charges on the left is 1:1 and on the right 
3:1—with their saddle point between the two charges 


3.1.3 Electrostatic Potential 


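The electrostatic force field is irrotational. Therefore, according to p. 25, we would
like to attribute it to a scalar field $\Phi$, which will be much easier to calculate with than
the vector field:

$$
\mathbf{E} = -\nabla\Phi, \quad\text{with}\quad \Phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \int \mathrm{d}V'\, \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}.
$$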




Fig. 3.3 Field lines of two unlike charges, with the ratio of the charges again 1:1 and 3:1 on the left and
right, respectively


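$\Phi$ is called the electrostatic potential, because it is connected with the potential energy
$E_{\mathrm{pot}}$. (Note that here, and in thermodynamics, we use V to denote the volume, so we
cannot use this letter for the potential energy, as is possible in classical mechanics
and quantum mechanics.) As is well known (see p. 56), we have $\mathbf{F} = -\nabla E_{\mathrm{pot}}$, so
here $\mathbf{F} = q\,\mathbf{E} = -q\,\nabla\Phi$. Therefore,

$$
E_{\mathrm{pot}} = q\,\Phi,
$$

and in classical mechanics (see p. 77), $E_{\mathrm{pot}} = m\,\Phi$ with the mechanical potential $\Phi$.
Between two points $\mathbf{r}_1$ and $\mathbf{r}_0$ of different potential, there is a voltage:

$$
U = \Phi(\mathbf{r}_1) - \Phi(\mathbf{r}_0) = \int_{\mathbf{r}_0}^{\mathbf{r}_1} \mathrm{d}\mathbf{r} \cdot \nabla\Phi = -\int_{\mathbf{r}_0}^{\mathbf{r}_1} \mathrm{d}\mathbf{r} \cdot \mathbf{E} = \int_{\mathbf{r}_1}^{\mathbf{r}_0} \mathrm{d}\mathbf{r} \cdot \mathbf{E}.
$$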


It can be positive or negative, but we are often concerned only with its absolute value.
Since $\rho/\varepsilon_0 = \nabla \cdot (-\nabla\Phi)$, the potential follows from a linear differential equation
with the charge density as inhomogeneous term, viz., the Poisson equation

$$
\Delta\Phi = -\frac{\rho}{\varepsilon_0}.
$$


To obtain unique solutions, we have to set boundary conditions (to gauge the solution).
The potential and its first derivatives must vanish at infinity, like the charge
density.

This boundary condition can also be introduced via Green’s second theorem 
(p. 17). Then one obtains the equation 




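$$
\oint_{(V')} \mathrm{d}\mathbf{f}' \cdot \Big( \Phi(\mathbf{r}')\,\nabla' \frac{1}{|\mathbf{r} - \mathbf{r}'|} - \frac{1}{|\mathbf{r} - \mathbf{r}'|}\,\nabla'\Phi(\mathbf{r}') \Big)
= \int_{V'} \mathrm{d}V' \Big( \Phi(\mathbf{r}')\,\Delta' \frac{1}{|\mathbf{r} - \mathbf{r}'|} - \frac{1}{|\mathbf{r} - \mathbf{r}'|}\,\Delta'\Phi(\mathbf{r}') \Big).
$$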


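Here the Poisson equation and $\Delta' |\mathbf{r} - \mathbf{r}'|^{-1} = -4\pi\,\delta(\mathbf{r} - \mathbf{r}')$ hold, according to
p. 26. Hence we obtain the “Green function solution” (see, e.g., Fig. 1.5 for the
cylindrical capacitor, with field lines on the left and equipotential lines on the
right):

$$
4\pi\,\Phi(\mathbf{r}) = \frac{1}{\varepsilon_0} \int_{V'} \mathrm{d}V'\, \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}
+ \oint_{(V')} \frac{\mathrm{d}\mathbf{f}' \cdot \nabla'\Phi(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}
- \oint_{(V')} \mathrm{d}\mathbf{f}' \cdot \Phi(\mathbf{r}')\,\nabla' \frac{1}{|\mathbf{r} - \mathbf{r}'|}.
$$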


The first integral is no longer taken over the whole space. The two surface integrals
account for all charges outside of it and occur as new boundary conditions. In particular,
$V'$ can also be a charge-free space, such that the first integral vanishes. Then the
potential and field strength are uniquely fixed by $\Phi$ and $\nabla\Phi$ on the surface. In
charge-free space, these two vary monotonically, as follows from the Poisson equation, so
the field has no extremum there.

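Incidentally, for a charge-free space, it is sufficient that either only $\Phi$ or only (the
normal component of) $\nabla\Phi$ is given on its surface. In particular, according to Gauss’s
theorem, for $\Delta\Phi = 0$, we have

$$
\oint \mathrm{d}\mathbf{f} \cdot \Phi\,\nabla\Phi = \int \mathrm{d}V\, \nabla \cdot (\Phi\,\nabla\Phi) = \int \mathrm{d}V\, \nabla\Phi \cdot \nabla\Phi.
$$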


If two solutions $\Phi_1$ and $\Phi_2$ of $\Delta\Phi = 0$ now satisfy the boundary conditions, then
the surface integral for $\Phi = \Phi_1 - \Phi_2$ vanishes because of $\Phi_1 = \Phi_2$ or $\mathbf{n} \cdot \nabla\Phi_1 =
\mathbf{n} \cdot \nabla\Phi_2$. On the right, the integrand is nowhere negative. Consequently, everywhere
in the considered volume, we have $\nabla\Phi_1 = \nabla\Phi_2$, so $\Phi_1$ and $\Phi_2$ differ at most by a
constant, and this can eventually be fixed by the gauge.

In a finite regime, the same electric field can be generated by different charge 
distributions. The continuation across the boundaries is not unique. This should be 
considered if models for the charge distribution in inaccessible regions are presented. 


3.1.4 Dipoles 


So far we have allowed charges of both signs, but the test body should carry only 
charge of one sign, and as small as possible. 

Totally new phenomena arise if the test body carries two point charges $\pm q$ of
opposite sign. For simplicity, we assume that its total charge $Q = \int \mathrm{d}V\, \rho(\mathbf{r})$ vanishes,
otherwise we would also have to consider the properties of a monopole, which have
already been treated. An ideal dipole consists of two point charges $\pm q$ at the positions
$\mathbf{r}_\pm = \pm\tfrac{1}{2}\mathbf{a}$, where $a$ is as small as possible, but the product $qa$ is nevertheless finite.
We thus introduce the dipole moment

$$
\mathbf{p} = \int \mathrm{d}V\, \mathbf{r}\,\rho(\mathbf{r}).
$$


In the example considered, we would have $\mathbf{p} = q\mathbf{a}$. For finite $a$, higher multipole
moments appear, i.e., integrals over $\rho$ with weight factors other than $\mathbf{r}$ or 1, which
we shall only discuss at the end of Sect. 3.1.7. If the total charge vanishes, the dipole
moment does not depend upon the choice of the origin of $\mathbf{r}$.

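However, in the following it will be advantageous, as in Sect. 2.2.2, to introduce
center-of-charge and relative coordinates. Here we restrict ourselves to $Q = 0$ and
choose $\mathbf{R} = \int \mathrm{d}V\, \mathbf{r}\,|\rho| \big/ \int \mathrm{d}V\, |\rho|$ as “center of charge”. We derive the potential
energy of the dipole $\mathbf{p}$ in the electric field $\mathbf{E}$ from a series expansion of the potential
around the center of the dipole:

$$
\Phi(\mathbf{R} + \mathbf{r}) = \Phi(\mathbf{R}) - \mathbf{r} \cdot \mathbf{E}(\mathbf{R}) + \cdots \quad\text{because}\quad \nabla\Phi = -\mathbf{E}.
$$

For $Q = 0$ and with $E_{\mathrm{pot}} = \int \mathrm{d}V\, \rho(\mathbf{r})\,\Phi(\mathbf{r})$, this supplies the potential energy

$$
E_{\mathrm{pot}} = -\mathbf{p} \cdot \mathbf{E}.
$$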


Here the field strength is to be taken at the position of the dipole. For a homogeneous
field, it does not depend on the position. Then there is no force $\mathbf{F} = -\nabla E_{\mathrm{pot}}$ acting on
the dipole, since the forces on the different charges cancel each other in the homogeneous
field.

In an inhomogeneous field, the forces acting on the two poles have different
strengths. Then there remains an excess force

$$
\mathbf{F} = -\nabla E_{\mathrm{pot}} = \nabla(\mathbf{p} \cdot \mathbf{E}) = (\mathbf{p} \cdot \nabla)\,\mathbf{E}
$$

acting on the dipole (its “center-of-charge coordinate”). For the last equation, we
have used $\nabla \times \mathbf{E} = 0$ and constant $\mathbf{p}$.
In addition, there is a torque

$$
\mathbf{N} = \mathbf{p} \times \mathbf{E},
$$

if $\mathbf{p}$ and $\mathbf{E}$ do not have the same direction (in which case the potential energy is minimal,
stable equilibrium) or opposite directions (unstable equilibrium). (Note that the
letter M common in classical mechanics is reserved for the magnetization in
electromagnetism, and $\mathbf{r} \times p\mathbf{E} = p\,\mathbf{r} \times \mathbf{E}$.) The expression $\mathbf{p} \times \mathbf{E}$ supplies only the part
expressible in relative coordinates. In addition, there is a part connected to $\mathbf{r}$, the
“center-of-charge coordinate” (the position of the dipole, so far called $\mathbf{R}$), namely
$\mathbf{r} \times (\mathbf{p} \cdot \nabla)\mathbf{E}$. Because $\mathbf{p} = (\mathbf{p} \cdot \nabla)\,\mathbf{r}$, the sum can also be combined to

$$
\mathbf{N} = (\mathbf{p} \cdot \nabla)\,(\mathbf{r} \times \mathbf{E}).
$$

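However, in many cases, only the torque $\mathbf{p} \times \mathbf{E}$ with respect to the center of the
dipole is of interest.

What field is generated by a dipole? To answer this question, we consider to
begin with the potential of two point charges $\pm q'$ at the positions $\mathbf{r}'_\pm = \mathbf{r}' \pm \tfrac{1}{2}\mathbf{a}$ and
investigate the limit $a \ll |\mathbf{r} - \mathbf{r}'|$. Since

$$
|\mathbf{r} - \mathbf{r}' \mp \tfrac{1}{2}\mathbf{a}|^{-1} \approx |\mathbf{r} - \mathbf{r}'|^{-1} \mp \tfrac{1}{2}\,\mathbf{a} \cdot \nabla |\mathbf{r} - \mathbf{r}'|^{-1},
$$

we end up with

$$
4\pi\varepsilon_0\,\Phi(\mathbf{r}) = -\mathbf{p}' \cdot \nabla \frac{1}{|\mathbf{r} - \mathbf{r}'|} = \frac{\mathbf{p}' \cdot (\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} = -\nabla \cdot \frac{\mathbf{p}'}{|\mathbf{r} - \mathbf{r}'|}.
$$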


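Thus the scalar product of $\mathbf{p}'$ with the unit vector $\mathbf{e} = (\mathbf{r} - \mathbf{r}')/|\mathbf{r} - \mathbf{r}'|$ from the
source $\mathbf{r}'$ to the point $\mathbf{r}$ is important. The potential decays in inverse proportion to the
square of the distance. The field strength $\mathbf{E} = -\nabla\Phi$ decays more strongly by one
power of the distance:

$$
4\pi\varepsilon_0\,\mathbf{E}(\mathbf{r}) = \nabla\Big( \mathbf{p}' \cdot \nabla \frac{1}{|\mathbf{r} - \mathbf{r}'|} \Big) = \frac{3\,\mathbf{p}' \cdot \mathbf{e}\,\mathbf{e} - \mathbf{p}'}{|\mathbf{r} - \mathbf{r}'|^3} - \frac{4\pi}{3}\,\mathbf{p}'\,\delta(\mathbf{r} - \mathbf{r}').
$$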


An example is shown in Fig. 3.4. The last term appears because $|\mathbf{r} - \mathbf{r}'|^{-1}$ is
discontinuous at $\mathbf{r} = \mathbf{r}'$. Thus, the volume integral around this point must still be considered
(see Problem 3.8). For a point charge, a delta function appears at $\nabla \cdot \mathbf{E}$, thus
ultimately with the derivative of the field strength; for the dipole this derivative is
already included by taking the limit $a \to 0$. We usually only require the field outside
the source, so this addition is not needed, but it does contribute to the average field,
in particular, for N dipoles $\mathbf{p}'$ in a volume V with $\Delta\mathbf{E} = -\tfrac{1}{3}\varepsilon_0^{-1} N \mathbf{p}'/V$. We will
take advantage of this in the next section, in the context of polarization.
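
Outside the source, the dipole formula can be compared with the field of two opposite point charges at a small separation. The following Python sketch does this; the function names and the sample values are assumptions for the illustration, and the delta-function term is left out because the field is evaluated away from the dipole.

```python
import math

EPS0 = 8.854187817e-12  # electric field constant, F/m


def point_charge_field(q, r_src, r):
    """Field of a point charge q at r_src, evaluated at r."""
    d = [a - b for a, b in zip(r, r_src)]
    dist = math.sqrt(sum(c * c for c in d))
    pref = q / (4.0 * math.pi * EPS0 * dist**3)
    return [pref * c for c in d]


def dipole_field(p, r_src, r):
    """Field of an ideal dipole p at r_src, evaluated at r != r_src."""
    d = [a - b for a, b in zip(r, r_src)]
    dist = math.sqrt(sum(c * c for c in d))
    e = [c / dist for c in d]                       # unit vector from source to r
    p_dot_e = sum(pk * ek for pk, ek in zip(p, e))
    pref = 1.0 / (4.0 * math.pi * EPS0 * dist**3)
    return [pref * (3.0 * p_dot_e * ek - pk) for pk, ek in zip(p, e)]


# Two charges +q and -q separated by a small vector a around the origin
q, a = 1.0e-9, (0.0, 0.0, 1.0e-4)
r = (0.3, 0.1, 0.5)
plus = point_charge_field(+q, [0.5 * c for c in a], r)
minus = point_charge_field(-q, [-0.5 * c for c in a], r)
pair = [u + v for u, v in zip(plus, minus)]
ideal = dipole_field([q * c for c in a], (0.0, 0.0, 0.0), r)
print(pair)
print(ideal)
```

The two results differ only by terms of relative order $(a/|\mathbf{r} - \mathbf{r}'|)^2$, which is the leading correction for a dipole of finite extension.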

But first we consider also dipole moments, which will be distributed evenly over
a sheet $\mathrm{d}\mathbf{f}$ and lead to a dipole density $P_A$. We then set $\mathrm{d}\mathbf{f}\,P_A = \mathrm{d}f\,\mathbf{P}_A$. Note that


Fig. 3.4 Field lines of a dipole pointing along the dashed symmetry axis. Right: The field in the
middle is magnified eight times. All other field lines are similar to the ones shown here, because
point-like sources do not provide a length scale (see Problem 3.14)




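$P_A$ can also be negative, because we have already selected the direction of $\mathrm{d}\mathbf{f}$, if
the surface of a finite body is intended (see p. 9). In particular, $\mathrm{d}\mathbf{f}$ should then point
outwards. We obtain the associated potential

$$
\Phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \int \frac{\mathrm{d}\mathbf{f}' \cdot (\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\, P_A(\mathbf{r}').
$$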


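The fraction in the integrand gives the solid angle $\mathrm{d}\Omega'$ subtended by the surface
element $\mathrm{d}\mathbf{f}'$ at the point $\mathbf{r}$, where the sign changes on crossing the surface. Therefore,
upon crossing the dipole layer, the potential jumps by $P_A/\varepsilon_0$:

$$
\Phi_+ - \Phi_- = \frac{P_A}{\varepsilon_0},
$$

while, according to Sect. 3.1.2, upon penetrating a monopole layer, it is the field $\mathbf{E}$
that jumps, and hence the first derivative of $\Phi$:

$$
\mathbf{n} \cdot (\mathbf{E}_+ - \mathbf{E}_-) = -\mathbf{n} \cdot (\nabla\Phi_+ - \nabla\Phi_-) = \frac{\rho_A}{\varepsilon_0}.
$$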


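We may therefore replace the boundary values on p. 170 by suitable monopole and dipole
densities on the surface of the considered volume. Then we have

$$
\mathrm{d}\mathbf{f} \cdot \nabla\Phi = -\mathrm{d}f\,\mathbf{n} \cdot \mathbf{E} = \frac{\mathrm{d}f\,\rho_A}{\varepsilon_0} \quad\text{and}\quad -\mathrm{d}\mathbf{f}\,\Phi = \frac{\mathrm{d}\mathbf{f}\,P_A}{\varepsilon_0},
$$

if the potential ($\Phi$) and the field ($\mathbf{E}$) vanish outside (see Fig. 3.5).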


Fig. 3.5 Potential (upper) and field strength (lower) along the axes of circular disks of radius R.
The disk is loaded with monopole charge density $\rho_A$ (left) or with dipole charge density $P_A$ (right),
where the potential discontinuity at the disk also leads to a delta function. The curve lower right
diverges at z = 0 (see Problems 3.18-3.19)




3.1.5 Polarization and Displacement Field 


So far it has been presumed that the charge distribution is also known in the atomic
interior. But in most cases such microscopic quantities are irrelevant. In macroscopic
physics, knowledge of the average charge density $\bar\rho$ is sufficient, if in addition the
average density $\mathbf{P}$ of the dipole moments is also used. The average is taken over many
atoms, as long as the volume $\Delta V$ of the averaging process is sufficiently small. For
N molecules (ions) in $\Delta V$ with charges $q_i$ and dipole moments $\mathbf{p}_i$, we have

$$
\bar\rho = \frac{1}{\Delta V} \sum_{i=1}^{N} q_i \quad\text{and}\quad \mathbf{P} = \frac{1}{\Delta V} \sum_{i=1}^{N} \mathbf{p}_i.
$$

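$\mathbf{P}$ is called the (electric) polarization. According to the last section, it is associated
with the potential

$$
\Phi(\mathbf{r}) = -\frac{1}{4\pi\varepsilon_0}\,\nabla \cdot \int \mathrm{d}V'\, \frac{\mathbf{P}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} = \frac{1}{4\pi\varepsilon_0} \int \mathrm{d}V'\, \mathbf{P}(\mathbf{r}') \cdot \nabla' \frac{1}{|\mathbf{r} - \mathbf{r}'|}.
$$

The last expression can be rewritten. Since

$$
\mathbf{P}(\mathbf{r}') \cdot \nabla' \frac{1}{|\mathbf{r} - \mathbf{r}'|} = \nabla' \cdot \frac{\mathbf{P}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} - \frac{\nabla' \cdot \mathbf{P}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|},
$$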


Gauss’s theorem yields

$$
\Phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0} \oint \mathrm{d}\mathbf{f}' \cdot \frac{\mathbf{P}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} - \frac{1}{4\pi\varepsilon_0} \int \mathrm{d}V'\, \frac{\nabla' \cdot \mathbf{P}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}.
$$

A polarized medium then has the same field as the one due to the surface charge
density $\rho_A' = \mathbf{n} \cdot \mathbf{P}$ and the space charge density $\rho' = -\nabla \cdot \mathbf{P}$. The minus sign is
easy to understand. If we assume a rod of homogeneous polarization, then a positive
charge results just there on its surface where the polarization has a sink. For $\rho'$,
we sometimes speak of apparent charges, because they actually belong to dipole
moments and are not freely mobile. This concept is somewhat misleading, however,
since the apparent charges do exist microscopically.

If we integrate over the total space, then the surface integral does not contribute,
since there is no matter at infinity. Clearly, we may replace the microscopic (“true”)
charge density $\rho$ by the average $\bar\rho$ and the charge density $\rho' = -\nabla \cdot \mathbf{P}$ of the
polarization:

$$
\rho = \bar\rho - \nabla \cdot \mathbf{P}.
$$

From the basic equation $\rho = \nabla \cdot \varepsilon_0\mathbf{E}$ of microscopic electrostatics, we can in this
way infer $\bar\rho = \nabla \cdot (\varepsilon_0\mathbf{E} + \mathbf{P})$. Therefore, we introduce the electric displacement
field (displacement) $\mathbf{D}$, defined by

$$
\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P},
$$

and obtain as basic equations of macroscopic electrostatics

$$
\nabla \cdot \mathbf{D} = \bar\rho, \qquad \nabla \times \mathbf{E} = 0.
$$


The electric field remains irrotational because, according to our derivation, we may 
later calculate with a scalar potential. These basic equations also yield 


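$$
\mathbf{n} \cdot (\mathbf{D}_+ - \mathbf{D}_-) = \bar\rho_A, \qquad \mathbf{n} \times (\mathbf{E}_+ - \mathbf{E}_-) = 0,
$$

and

$$
\oint_{(V)} \mathrm{d}\mathbf{f} \cdot \mathbf{D} = \bar{Q}, \qquad \oint_{(A)} \mathrm{d}\mathbf{r} \cdot \mathbf{E} = 0.
$$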


Like the polarization P, the electric displacement field D has the unit C/m?. 

So far we have viewed the dipole moments as being given. There are indeed 
molecules with permanent electric dipole moments; by allusion to paramagnetism, 
they are said to be paraelectric. But if no external field is applied and if the temper- 
ature is high, then the polarization averages to zero because of the disorder in the 
directions. The orientation increases with increasing field strength—and decreas- 
ing temperature. In addition, an electric field also shifts the charges in originally 
non-polar atoms and induces an electric dipole moment. In both cases, to a first 
approximation, P depends linearly upon E: 


$$
\mathbf{P} = \chi_e\,\varepsilon_0\mathbf{E}.
$$


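The electric susceptibility $\chi_e$ is a mere number. It is related to the polarizability $\alpha$ of
the various molecules. To this end, we assume N equal molecules in the volume V,
whence $\mathbf{P} = n\mathbf{p}$ with $n = N/V$. Each individual molecule becomes polarized by the
electric field $\mathbf{E}_0$ at its location, $\mathbf{p} = \alpha\varepsilon_0\mathbf{E}_0$. In doing this, according to the last section,
we assume that $\mathbf{E}_0$ differs from the average field strength $\mathbf{E}$ by $\tfrac{1}{3}\varepsilon_0^{-1} n\mathbf{p} = \tfrac{1}{3}\varepsilon_0^{-1}\mathbf{P}$:

$$
\mathbf{P} = n\alpha\,(\varepsilon_0\mathbf{E} + \tfrac{1}{3}\mathbf{P}) = \frac{n\alpha}{1 - n\alpha/3}\,\varepsilon_0\mathbf{E}.
$$

We deduce the formula due to Clausius and Mossotti, viz.,

$$
\chi_e = \frac{n\alpha}{1 - n\alpha/3} \quad\Longleftrightarrow\quad n\alpha = \frac{3\chi_e}{\chi_e + 3},
$$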


which has been derived here following [1]. 
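
The two directions of the Clausius–Mossotti relation can be written as a pair of inverse functions. This is a small Python sketch with hypothetical function names; `n_alpha` stands for the dimensionless product $n\alpha$, and the sample value is an arbitrary illustration.

```python
def susceptibility_from_polarizability(n_alpha):
    """chi_e = n*alpha / (1 - n*alpha/3), valid for n*alpha < 3."""
    if n_alpha >= 3.0:
        raise ValueError("n*alpha must be smaller than 3")
    return n_alpha / (1.0 - n_alpha / 3.0)


def polarizability_from_susceptibility(chi_e):
    """n*alpha = 3*chi_e / (chi_e + 3)."""
    return 3.0 * chi_e / (chi_e + 3.0)


chi = susceptibility_from_polarizability(0.3)
print(chi, polarizability_from_susceptibility(chi))
```

For $n\alpha \ll 1$ the local-field correction is negligible and $\chi_e \approx n\alpha$, while $\chi_e$ grows without bound as $n\alpha$ approaches 3.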
However, in crystals there may be preferential directions, such that $\mathbf{P}$ is not then
parallel to $\mathbf{E}$. In this case, $\chi_e$ is a symmetric tensor of second rank (an anti-symmetric
part would supply an additional term $\mathbf{P}_a = \boldsymbol{\chi}_e \times \varepsilon_0\mathbf{E}$ with a suitable vector $\boldsymbol{\chi}_e$, which
contradicts the above-mentioned explanation for the polarization: even if there were
microscopic screw axes, the polarization would nevertheless average out) with three
principal dielectric axes, along which $\mathbf{P}$ is parallel to $\mathbf{E}$, but $P/E$ is still different.
There are also ferroelectric materials, in which a permanent polarization appears
even when the field is switched off, and the dipole moments do not average out.
In addition, $\chi_e$ does not remain constant at high fields because there are non-linear
saturation effects. We will not go into all these special cases.
For the electric displacement field, it thus follows that

$$
\mathbf{D} = (1 + \chi_e)\,\varepsilon_0\mathbf{E} = \varepsilon\mathbf{E},
$$

with the permittivity (dielectric constant) $\varepsilon$ and the relative dielectric constant $\varepsilon_r =
\varepsilon/\varepsilon_0 = 1 + \chi_e$. This depends upon the temperature and for water is unusually high,
namely equal to 80 at 20 °C and 55 at 100 °C. In crystals, $\varepsilon$ is generally a (symmetric)
tensor of second rank.

We will now always consider the two fields simultaneously, i.e., the electric field
strength $\mathbf{E}$ determined by the force $\mathbf{F} = q\mathbf{E}$ acting on a test charge $q$, and the electric
displacement field $\mathbf{D}$ given by the average charge density. When we do this, the relation between $\mathbf{D}$
and $\mathbf{E}$ is important ($\mathbf{D} = \varepsilon\mathbf{E}$), and we will restrict ourselves to scalar permittivities.


3.1.6 Field Equations in Electrostatics 


In the following we will restrict ourselves to macroscopically measurable quantities 
and, following the usual practice, omit the bar over the charge density. Thus we start 
from the basic equations 


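$$
\nabla \cdot \mathbf{D} = \rho, \qquad \nabla \times \mathbf{E} = 0, \quad\text{and}\quad \mathbf{D} = \varepsilon\mathbf{E},
$$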


and consider now different cases. 
Insulators do not contain mobile charges, and therefore in their interior we have 


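$$
\nabla \cdot \mathbf{D} = 0, \qquad \nabla \times \mathbf{E} = 0.
$$

Since they can be polarized, we have to distinguish carefully between $\mathbf{D}$ and $\mathbf{E}$.
According to the second equation, we may replace the field strength $\mathbf{E}$ by $-\nabla\Phi$ and,
using $\mathbf{D} = \varepsilon\mathbf{E}$ from the first equation, we obtain

$$
\nabla \cdot \varepsilon\,\nabla\Phi = 0.
$$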


In particular, for constant permittivity, we obtain the Laplace equation

$$
\Delta\Phi = 0.
$$

The boundary values then become physically decisive.

Fig. 3.6 Refraction of the electric field entering into an insulator of higher permittivity (here
$\varepsilon_+ = 2\varepsilon_-$). The force lines are refracted away from the normal. In contrast, according to the
optical refraction law (see p. 220) the rays are refracted towards the normal for $n_+ > n_-$ and,
instead of $\tan\alpha$, we have in that case $\sin\alpha$

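For two-dimensional problems, analytical functions in the complex plane are
useful. A function $f(z) = \Phi(x, y) + \mathrm{i}\Psi(x, y)$ is only differentiable if, regardless of
the direction of approach,

$$
\frac{\mathrm{d}f}{\mathrm{d}z} = \frac{\partial\Phi}{\partial x} + \mathrm{i}\,\frac{\partial\Psi}{\partial x} = \frac{\partial\Phi}{\mathrm{i}\,\partial y} + \mathrm{i}\,\frac{\partial\Psi}{\mathrm{i}\,\partial y},
$$

i.e., if the Cauchy–Riemann equations are satisfied:

$$
\frac{\partial\Phi}{\partial x} = \frac{\partial\Psi}{\partial y} \quad\text{and}\quad \frac{\partial\Phi}{\partial y} = -\frac{\partial\Psi}{\partial x}.
$$

These lead to the Laplace equations $\Delta\Phi = 0$ and $\Delta\Psi = 0$, and to $\nabla\Phi \cdot \nabla\Psi =
0$. If the entity $\{\Phi = \text{const.}\}$ represents equipotential lines, then the other entity
$\{\Psi = \text{const.}\}$ represents field lines. For example, Fig. 1.4 corresponds to $f = z^2$.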
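
For the example $f = z^2$, both the Cauchy–Riemann equations and the orthogonality of the two families of lines can be checked with finite differences. The helper names and the step size in this Python sketch are assumptions made for the illustration.

```python
def partials(f, x, y, h=1.0e-6):
    """Central-difference estimates of df/dx and df/dy for f(x + iy)."""
    f_x = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2.0 * h)
    f_y = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2.0 * h)
    return f_x, f_y


def cauchy_riemann_residual(f, x, y):
    """|df/dx - df/(i dy)|, which vanishes for a differentiable f."""
    f_x, f_y = partials(f, x, y)
    return abs(f_x - f_y / 1j)


def gradient_product(f, x, y):
    """grad(Phi) . grad(Psi) with Phi = Re f and Psi = Im f."""
    f_x, f_y = partials(f, x, y)
    return f_x.real * f_x.imag + f_y.real * f_y.imag


square = lambda z: z * z
print(cauchy_riemann_residual(square, 0.7, -0.4))
print(gradient_product(square, 0.7, -0.4))
# Counter-example: complex conjugation is not differentiable
print(cauchy_riemann_residual(lambda z: z.conjugate(), 0.7, -0.4))
```

For $f = z^2$ both printed numbers vanish up to rounding, so the lines $\Phi = x^2 - y^2 = \text{const.}$ and $\Psi = 2xy = \text{const.}$ cross at right angles.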

At the interface between insulators, the normal components of D and the tangen- 
tial components of E are continuous because 


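$$
\mathbf{n} \cdot (\mathbf{D}_+ - \mathbf{D}_-) = 0, \qquad \mathbf{n} \times (\mathbf{E}_+ - \mathbf{E}_-) = 0.
$$

Hence, for scalar permittivity, it follows from $|\mathbf{n} \times \mathbf{E}_+|/|\mathbf{n} \cdot \mathbf{D}_+| = |\mathbf{n} \times \mathbf{E}_-|/|\mathbf{n} \cdot
\mathbf{D}_-|$ that $\sin\alpha_+/(\varepsilon_+\cos\alpha_+) = \sin\alpha_-/(\varepsilon_-\cos\alpha_-)$, where $\alpha$ is the angle between
the field vector and the normal vector. Hence (see Fig. 3.6),

$$
\frac{\tan\alpha_+}{\tan\alpha_-} = \frac{\varepsilon_+}{\varepsilon_-}.
$$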



In homogeneous conductors, charges can move freely. Therefore, for static equi- 
librium the field strength in the interior of the conductor must vanish, and with it also 
the polarization: 


$$
\mathbf{E} = \mathbf{D} = 0, \quad\text{and thus}\quad \Phi = \text{const.}
$$


in the interior of homogeneous conductors. 

At the interface between insulator and conductor, there may be surface charges, 
but no fields within the conductor. Therefore, the electric field lines in the insulator 
end perpendicularly at the interface: 


$$
\mathbf{n} \cdot \mathbf{D}_I = \rho_A, \qquad \mathbf{n} \times \mathbf{E}_I = 0.
$$

The subscript on $\mathbf{E}_I$ reminds us that we are considering the insulator, but it is not
actually needed, since the fields vanish within the conductor.

At the interface between two conductors, the potential has a discontinuity, since 
their conduction electrons generally have different work functions. Upon contact 
between the two metals, charges move into the more strongly binding regime, until 
a corresponding counter-field has built up. Only then is there a static situation. Thus 
we find a contact voltage. The situation for the immersion of a metal in an electrolyte 
is similar, e.g., immersion of a copper rod in sulfuric acid, where some Cu²⁺ ions
become dissolved and hence a current flows until the negative loading of the rod has 
built up an electric counter-field. 

All these fields caused by inhomogeneities are said to be induced, since they 
do not originate from an external charge, but from the structure of the material. We 
denote the induced field strength (as do Panofsky & Phillips) by $\mathbf{E}'$; another common
notation is $\mathbf{E}^{(e)}$. In electrostatics, we have


E+E’=0 


in inhomogeneous conductors—in static equilibrium the induced field strength is 
canceled by the counter-field. 


3.1.7 Problems in Electrostatics 


In most cases, the field E(r) in the insulator is to be determined for a given form and 
position of the conductors and with a further requirement: we are given either the 
voltage 


$$
U = \Phi_1 - \Phi_0 = -\int_{\mathbf{r}_0}^{\mathbf{r}_1} \mathrm{d}\mathbf{r} \cdot \mathbf{E}
$$




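between the conductors 0 and 1 (where arbitrary initial and final points on the
conductors may be taken, because the potential on each conductor is constant, and
any path in-between, because the field is irrotational) or the charges

$$
Q_i = \oint_{A_i} \mathrm{d}f\, \rho_A = \oint_{A_i} \mathrm{d}\mathbf{f} \cdot \varepsilon\mathbf{E} = -\oint_{A_i} \mathrm{d}\mathbf{f} \cdot \varepsilon\,\nabla\Phi
$$

on the conductor surfaces $A_i$. For two conductors with charges $Q > 0$ and $-Q$ and
the voltage $U > 0$, Q and U are related to each other via a geometrical quantity,
namely the capacity

$$
C = \frac{Q}{U} = \varepsilon\,\frac{\big|\oint \mathrm{d}\mathbf{f} \cdot \mathbf{E}\big|}{\big|\int \mathrm{d}\mathbf{r} \cdot \mathbf{E}\big|}.
$$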


The best approach here is to solve the problem using Gauss’s theorem or using the 
Laplace equation, and to adapt the coordinates to the boundary geometry. In the 
following, we consider some examples whose solution can be easily anticipated. 

The spherical capacitor is a conducting sphere with charge Q and radius $r_K$ in a
comparatively large (non-conducting) dielectric with scalar permittivity $\varepsilon$. This has a
spherically symmetric field, which jumps from 0 to its maximum value at the charged
surface. Viewed from outside the sphere, it could also originate from a point charge
at the center of the sphere:

$$
\Phi(\mathbf{r}) = \begin{cases} U, \\ U\,r_K/r, \end{cases} \qquad
\mathbf{E}(\mathbf{r}) = \begin{cases} 0 & \text{for } r < r_K, \\ U\,r_K\,\mathbf{r}/r^3 & \text{for } r > r_K, \end{cases}
$$

with $U = Q/C$ and $Q = \oint \mathrm{d}\mathbf{f} \cdot \mathbf{D} = 4\pi r_K^2\,\varepsilon E(r_K) = 4\pi\varepsilon\,U\,r_K$, whence the
capacity is $C = 4\pi\varepsilon\,r_K$. (The potential has a kink at the charged surface.)

As a cylindrical capacitor, we take two coaxial conducting cylinders of length $l$,
separated by a dielectric with scalar permittivity $\varepsilon$. If the inner cylinder (with radius
$R_i$) carries the charge Q and the outer cylinder (with radius $R_a$) the charge $-Q$,
then for $l \gg R_a$, the contribution of the cylinder ends may be neglected. Then in the
dielectric there is a field strength decaying as $R^{-1}$ which is the solution of Gauss’s
theorem $Q = \oint \mathrm{d}\mathbf{f} \cdot \varepsilon\mathbf{E}$, since the area of the inner cylinder walls is $A = 2\pi R_i\,l$, and
$Q = 2\pi R\,l\,\varepsilon E(R)$ and $\Phi \propto \ln(R/R_a)$ in the capacitor:

$$
\Phi(R) = \begin{cases} U, \\[4pt] U\,\dfrac{\ln(R/R_a)}{\ln(R_i/R_a)}, \\[4pt] 0, \end{cases} \qquad
\mathbf{E}(\mathbf{R}) = \begin{cases} 0 & \text{for } R \le R_i, \\[4pt] \dfrac{U}{\ln(R_a/R_i)}\,\dfrac{\mathbf{R}}{R^2} & \text{for } R_i \le R \le R_a, \\[4pt] 0 & \text{for } R_a \le R, \end{cases}
$$

noting that $-\ln(R_i/R_a) = \ln(R_a/R_i)$. Hence we find $Q = 2\pi R\,l\,\varepsilon\,U/\{R \ln(R_a/R_i)\}$
and then $C = 2\pi\varepsilon l/\ln(R_a/R_i)$. For conductors (with the very small distance $d =
R_a - R_i \ll R_i$ and area $A = 2\pi R_i l$ for the inner conductor), and since $\ln(R_a/R_i) =
\ln(1 + d/R_i) \approx d/R_i$, we may replace this by

$$
E \approx \frac{|U|}{d} \approx \frac{|Q|}{\varepsilon A}.
$$

These equations are also valid for the plate capacitor, if boundary effects may be 
neglected. 
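
The three capacities can be compared numerically. The following Python sketch uses hypothetical function names; the plate formula $C = \varepsilon A/d$ is the one implied by $E \approx |U|/d \approx |Q|/(\varepsilon A)$, and the sample dimensions are arbitrary.

```python
import math

EPS0 = 8.854187817e-12  # electric field constant, F/m


def sphere_capacity(radius, eps=EPS0):
    """C = 4 pi eps r for a conducting sphere in a large dielectric."""
    return 4.0 * math.pi * eps * radius


def cylinder_capacity(r_inner, r_outer, length, eps=EPS0):
    """C = 2 pi eps l / ln(R_a/R_i), end effects neglected."""
    return 2.0 * math.pi * eps * length / math.log(r_outer / r_inner)


def plate_capacity(area, distance, eps=EPS0):
    """C = eps A / d, boundary effects neglected."""
    return eps * area / distance


# Thin gap: d = 0.1 mm on an inner radius of 10 cm, length 1 m
r_i, d, length = 0.1, 1.0e-4, 1.0
c_cyl = cylinder_capacity(r_i, r_i + d, length)
c_plate = plate_capacity(2.0 * math.pi * r_i * length, d)
print(sphere_capacity(1.0), c_cyl, c_plate, c_cyl / c_plate)
```

For $d \ll R_i$ the ratio of the cylinder result to the plate result approaches one, with a leading correction of about $d/(2R_i)$.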

When capacitors with capacities $C_k$ are connected in parallel, the total capacity is
$C = Q/U = \sum_k C_k$, because $U = U_k$ and $Q = \sum_k Q_k$. For capacitors connected
in series, we have $1/C = \sum_k 1/C_k$, because now $Q = Q_k$ and $U = \sum_k U_k =
\sum_k Q/C_k$.
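
Both rules translate directly into code. A minimal sketch in Python, with illustrative function names:

```python
def parallel_capacity(capacities):
    """Total capacity for a parallel connection: C = sum of C_k."""
    return sum(capacities)


def series_capacity(capacities):
    """Total capacity for a series connection: 1/C = sum of 1/C_k."""
    return 1.0 / sum(1.0 / c for c in capacities)


print(parallel_capacity([1.0e-6, 2.0e-6]))
print(series_capacity([1.0e-6, 1.0e-6]))
```

A series connection always gives less than the smallest individual capacity, a parallel connection more than the largest.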

For a point charge q at a distance a in front of a conducting plane the field lines 
must end perpendicularly on the plane and must be irrotational in front of it. In order 
to find the field distribution, we imagine an image charge —q at the same distance a 
behind the conductor surface—the field of the two point charges is shown on the left 
in Fig. 3.3. The total field of the two point charges satisfies the conditions in front of 
the plane. Hence, if we choose the center of this configuration as the origin, so that 
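$q$ is at $\mathbf{a}$ and $-q$ at $-\mathbf{a}$, we have

$$
\mathbf{E} = \frac{q}{4\pi\varepsilon} \Big( \frac{\mathbf{r} - \mathbf{a}}{|\mathbf{r} - \mathbf{a}|^3} - \frac{\mathbf{r} + \mathbf{a}}{|\mathbf{r} + \mathbf{a}|^3} \Big) \quad\text{for } \mathbf{r} \cdot \mathbf{a} > 0, \text{ otherwise } 0.
$$

This field is irrotational and has a source in front of the interface only at the position
of the point charge. On the interface, $\mathbf{r} \cdot \mathbf{a} = 0$ holds, so $|\mathbf{r} \pm \mathbf{a}|^3 = (r^2 + a^2)^{3/2}$,
and hence,

$$
\mathbf{E} = -\frac{q}{2\pi\varepsilon}\,\frac{\mathbf{a}}{(r^2 + a^2)^{3/2}}.
$$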

Therefore, $\mathbf{E}$ is perpendicular to the plane as required. Behind the mirror there is
no field. Therefore, we replace the imagined image charge now by a surface charge
$\rho_A = \mathbf{n} \cdot \mathbf{D}$ on the plane, precisely in the sense of the last paragraph of Sect. 3.1.3.
The image charge is replaced by an induced charge on the conductor surface. The
total induced charge is, according to the last two equations, equal to the image charge:

$$
\int \mathrm{d}f\, \rho_A = \int \mathrm{d}\mathbf{f} \cdot \mathbf{D} = -\frac{q\,a}{2\pi} \int_0^\infty \frac{2\pi R\,\mathrm{d}R}{(R^2 + a^2)^{3/2}} = \frac{q\,a}{\sqrt{R^2 + a^2}}\,\bigg|_0^\infty = -q.
$$

Of course, the total charge of the conductor must be conserved. We have to imagine 
a charge +q at infinity. For conductors of finite extension, it is important to know 
whether they are isolated or grounded—if necessary the image charge has to be 
neutralized by a further charge, e.g., for an ungrounded sphere, the additional charge 
has to be spread evenly over the surface. 
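
The statement that the induced charge on the plane adds up to the image charge can also be checked by numerical integration. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, and the substitution $R = a \tan t$ is chosen here only to map the infinite radial range onto a finite interval.

```python
import math


def induced_surface_charge_density(q, a, big_r):
    """rho_A = n.D on the conducting plane at distance big_r from the foot point."""
    return -q * a / (2.0 * math.pi * (big_r**2 + a**2) ** 1.5)


def total_induced_charge(q, a, n_steps=2000):
    """Integrate rho_A over the plane in rings of radius R, using R = a tan(t)."""
    dt = 0.5 * math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * dt                      # midpoint rule in t
        big_r = a * math.tan(t)
        dr_dt = a / math.cos(t) ** 2
        ring = 2.0 * math.pi * big_r * induced_surface_charge_density(q, a, big_r)
        total += ring * dr_dt * dt
    return total


print(total_induced_charge(1.0e-9, 0.05))
```

The result agrees with $-q$ to within the accuracy of the quadrature and does not depend on the distance $a$.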

With the help of image charges, the fields of other charged interfaces can be
represented, e.g., for a conducting sphere (Problem 3.20) or for a separating plane to a
non-conductor with a different permittivity. But then we have to calculate with
different charges $q \neq q'$ (the field-line pictures in the half-space inside the conductor
are similar to those on the right in Figs. 3.2 or 3.3), and in the half-space of the other
non-conductor, the field of a new source appears at the original position.

Each test charge leads to induced charges on the surrounding conductors and thus 
changes the field to be determined. Since this induction should remain negligible, 
the test charge must therefore be very small in comparison with the other charges. 
However, this is not possible for small distances because the induced charge is then 
very highly concentrated. Therefore, we may apply our concepts only to macroscopic 
objects. 

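If the microscopic charge density $\rho(\mathbf{r})$ is given, the potential and field strength
follow from the Poisson equation $\Delta\Phi = -\rho/\varepsilon_0$ or the integral
$\int dV'\,\rho(\mathbf{r}')/|\mathbf{r}-\mathbf{r}'|$.
Here, we would like to separate the variables $\mathbf{r}$ and $\mathbf{r}'$. This is managed by expanding
in terms of Legendre polynomials (see p. 81):

$$\frac{1}{|\mathbf{r}-\mathbf{r}'|} = \frac{1}{r}\sum_{n=0}^{\infty} P_n(\cos\theta)\left(\frac{r'}{r}\right)^{n} , \quad\text{for } r' < r \text{ and } \cos\theta = \frac{\mathbf{r}\cdot\mathbf{r}'}{r\,r'} .$$

According to this, and in particular, for positions outside the field-creating charges,
we may set

$$\Phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\sum_{n=0}^{\infty}\frac{1}{r^{\,n+1}}\int dV'\,\rho(\mathbf{r}')\,r'^{\,n}\,P_n(\cos\theta') .$$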


Upon integration, the angle between $\mathbf{r}$ and $\mathbf{r}'$ changes, so we have written here $\cos\theta'$.
For $n = 0$, the integral supplies the charge $Q'$ because $P_0 = 1$. (In Sect. 2.2.7 we
integrated over the mass density and hence obtained the mass.) The next integral leads
to $\mathbf{p}'\cdot\mathbf{r}/r$, because $r' P_1(\cos\theta') = \mathbf{r}'\cdot\mathbf{r}/r$, and the dipole moment is thus important.
Generally, the integrals appearing here are called multipole moments. (According to
Sect. 2.2.7, we have $(n+1)\,P_{n+1}(z) - (2n+1)\,z\,P_n(z) + n\,P_{n-1}(z) = 0$. However,
numerical factors are often added to the multipole moments.) For a dipole of finite
extension ($a \neq 0$), there is, e.g., an additional octupole moment, but its influence
decreases faster with the distance than that of the dipole moment (Problem 3.15).
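
The expansion of $1/|\mathbf{r}-\mathbf{r}'|$ and the recurrence relation for the Legendre polynomials can be tried out together. The sketch below is only an illustration: the function names and the truncation order `n_max` are assumptions, and the closed-form distance comes from the law of cosines.

```python
import math

def legendre(n_max, z):
    """Return [P_0(z), ..., P_n_max(z)] from (n+1) P_{n+1} = (2n+1) z P_n - n P_{n-1}."""
    p = [1.0, z]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * z * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]

def inverse_distance_series(r, r_prime, cos_theta, n_max=40):
    """Partial sum of (1/r) * sum_n P_n(cos theta) (r'/r)^n, valid for r' < r."""
    p = legendre(n_max, cos_theta)
    return sum(p[n] * (r_prime / r) ** n for n in range(n_max + 1)) / r

def inverse_distance_exact(r, r_prime, cos_theta):
    """1/|r - r'| from the law of cosines."""
    return 1.0 / math.sqrt(r * r + r_prime * r_prime - 2 * r * r_prime * cos_theta)

if __name__ == "__main__":
    print(inverse_distance_series(2.0, 0.5, 0.3))
    print(inverse_distance_exact(2.0, 0.5, 0.3))
```

The terms decrease like $(r'/r)^n$, so the series converges quickly far away from the charges, which is why the lowest multipole moments dominate there.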

Apart from the spherical multipoles of order $2^n$ just mentioned, there are (e.g.,
in ion optics) axial multipoles of order $2n$. In suitable cylindrical coordinates, their
potentials are proportional to $R^n \cos(n\varphi)$.


3.1.8 Energy of the Electrostatic Field 


The electric field carries energy, because according to p. 169, work is required to
charge a capacitor, i.e., the work $dW = U\,dQ$ to move the charge $dQ > 0$ from the
cathode to the anode. Because $Q = C\,U$, if we let the voltage (or charge) increase
from zero to its final value, we obtain

$$W = \int_0^Q U\,dQ = \frac{Q^2}{2C} = \tfrac12\,C\,U^2 = \tfrac12\,Q\,U$$

for the energy stored in the capacitor.

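Since $\rho = \nabla\cdot\mathbf{D}$ and therefore $\rho\,\Phi = \nabla\cdot(\Phi\mathbf{D}) - \mathbf{D}\cdot\nabla\Phi$, the expression
$W = \tfrac12\,Q\,U = \tfrac12\,Q\,\Delta\Phi = \tfrac12\int dV\,\rho\,\Phi$ can be rewritten. According to Gauss, the first term
supplies a surface integral which vanishes at infinity, since $\Phi\mathbf{D}$ approaches zero as
$r^{-3}$, and we obtain

$$W = \tfrac12\int dV\,\rho\,\Phi = \tfrac12\int dV\,\mathbf{D}\cdot\mathbf{E} .$$

The contribution to the last integral comes from all space containing fields, but the
contribution to $\int dV\,\rho\,\Phi$ only from space containing charge. However, the energy
density is only

$$w = \tfrac12\,\mathbf{D}\cdot\mathbf{E} ,$$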
because, subdividing the space, the interfaces should not contribute, and $\rho\,\Phi$
depends upon the gauge. In the sense of thermodynamics (see p. 575), we are dealing
with the density of the free (fully usable) energy $F$. Temperature and volume here
are the natural variables. The permittivity $\varepsilon$ generally depends upon the temperature
and the distances between the molecules. However, we follow the general custom
and write $w$ and not $f$. (The symbol $u$ is often also used, but this is misleading, since
$U$ means the internal energy and not the free, fully exploitable, energy.)

Since $\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}$, the energy density is composed of two parts. Firstly, the
field energy $\tfrac12\,\varepsilon_0\,\mathbf{E}\cdot\mathbf{E}$ "in vacuum", and secondly, the contribution $\tfrac12\,\mathbf{P}\cdot\mathbf{E}$ from the
dielectric, because according to p. 175 the dipole moment of polarizable molecules
increases linearly with the applied field strength and requires the work
$\int \mathbf{E}\cdot d\mathbf{p} = \tfrac12\,\mathbf{p}\cdot\mathbf{E}$. (This derivation succeeds only for $\mathbf{P} \propto \mathbf{E}$.)


3.1.9 Maxwell Stress Tensor in Electrostatics 


Forces are transmitted from one space element to the next, in which case we can speak 
of near-action forces. Here this must also be true for empty space, because electric 
forces permeate even empty space. We expect space filled with fields to behave as 
an elastic medium, and this property is described by the Maxwell stress tensor. 

In a continuous medium, the force $\mathbf{F}$ can be derived from a force density $\mathbf{f}$, which
will be denoted in this section by $\mathbf{f}$. For surface elements, we write $\mathbf{n}\,df$ or $\mathbf{g}^k\,df_k$.
Then,

$$\mathbf{F} = \int dV\,\mathbf{f} .$$


Fig. 3.7 Visualization of the Maxwell stress tensor for a homogeneous electric field (the field
strength points from the ⊕ to the ⊖ charges). Shown are the surface tensions on four cubes:
depending on their charge, the forces on opposite faces either cancel or supply the expected force
(see Problems 3.21-3.22)


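We decompose the force $\mathbf{f}\,dV$ acting on an infinitesimal cube $dV = dx\,dy\,dz$ into
surface element times surface tension $\sigma$. For tensile and compressive forces, there
are normal stresses perpendicular to the surface, while for shear forces, there are
shear stresses on the surface (see Fig. 3.7). The mechanical stress (Latin tensio) is
described by a tensor $\sigma$, viz.,

$$\begin{aligned}
dV\,f_x &= dy\,dz\,\{\sigma_{xx}(x+dx,\,y,\,z) - \sigma_{xx}(x,\,y,\,z)\} \\
&\quad + dz\,dx\,\{\sigma_{xy}(x,\,y+dy,\,z) - \sigma_{xy}(x,\,y,\,z)\} \\
&\quad + dx\,dy\,\{\sigma_{xz}(x,\,y,\,z+dz) - \sigma_{xz}(x,\,y,\,z)\} \\
&= dV\left(\frac{\partial\sigma_{xx}}{\partial x} + \frac{\partial\sigma_{xy}}{\partial y} + \frac{\partial\sigma_{xz}}{\partial z}\right) .
\end{aligned}$$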

The force density $\mathbf{f}$ is thus equal to the source density of the stress tensor $\sigma$. So far
we have considered only divergences of vectors and have obtained scalars. According
to Gauss's theorem, the volume integral of $\mathbf{f}$ can be converted into a surface integral.
Adjacent interfaces do not contribute, provided that $\sigma$ is continuous.

The stress tensor is useful in continuum mechanics, where near-action forces are
assumed. Therefore, we use it now in electromagnetism. However, here we shall
restrict ourselves to homogeneous matter with constant permittivity, since otherwise
the problem is much more involved (see [2]). We thus start from

$$\mathbf{F} = \int dV\,\rho\,\mathbf{E} = \int dV\,\mathbf{E}\,\nabla\cdot\mathbf{D} .$$


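In order to convert the integrand into the source density of a tensor, we use Sect. 1.1.8,
adding $-\mathbf{D}\times(\nabla\times\mathbf{E})$ to the integrand. It does not contribute in electrostatics, since
the field is irrotational. The same procedure holds in magnetostatics, but there
the field is solenoidal (source-free) and curls (vortices) appear instead.

For rectilinear (possibly oblique) coordinates, according to pp. 33-40 with fixed
vectors $\mathbf{g}^i$, we find

$$\mathbf{E}\,\nabla\cdot\mathbf{D} - \mathbf{D}\times(\nabla\times\mathbf{E}) = \sum_{ik}\mathbf{g}^i\left(\frac{\partial\,(E_i D^k)}{\partial x^k} - D^k\,\frac{\partial E_k}{\partial x^i}\right) ,$$

using $\mathbf{a}\times\mathbf{b} = \sum_{ikl}\mathbf{g}^i\,a^k b^l\,\varepsilon_{ikl}$ and the two equations
$\nabla\cdot\mathbf{a} = \sum_k \partial a^k/\partial x^k$ and
$(\nabla\times\mathbf{a})^i = \sum_{kl}\varepsilon^{ikl}\,\partial a_l/\partial x^k$, as well as the identity
$\sum_i \varepsilon_{ikl}\,\varepsilon^{imn} = \delta_k^m\delta_l^n - \delta_k^n\delta_l^m$. In addition,
for homogeneous matter, i.e., an invariant permittivity tensor, we have

$$\sum_k D^k\,\frac{\partial E_k}{\partial x^i} = \frac12\,\frac{\partial\,(\mathbf{E}\cdot\mathbf{D})}{\partial x^i} = \frac{\partial w}{\partial x^i} ,$$

and thus $\sum_i \mathbf{g}^i\,\partial w/\partial x^i = \nabla w = \sum_{ik}\mathbf{g}^i\,g_i^{\,k}\,\partial w/\partial x^k$. Together these imply

$$\mathbf{E}\,\nabla\cdot\mathbf{D} - \mathbf{D}\times(\nabla\times\mathbf{E}) = \sum_{ik}\mathbf{g}^i\,\frac{\partial}{\partial x^k}\left(E_i D^k - \tfrac12\,g_i^{\,k}\,\mathbf{E}\cdot\mathbf{D}\right) .$$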


Therefore, we introduce the Maxwell stress tensor (in rectilinear coordinates):

$$T^{ik} = w\,g^{ik} - E^i D^k , \quad\text{with } w = \tfrac12\,\mathbf{E}\cdot\mathbf{D} .$$

(In Cartesian coordinates, it has the trace $\operatorname{tr}T = 3w - \mathbf{E}\cdot\mathbf{D} = w > 0$, and the trace
does not depend upon the choice of coordinates. Some authors use it with opposite
sign, but we shall see in Sect. 3.4.11 that this has disadvantages. In the form mentioned
here, it is symmetric only for scalar permittivity. See also other forms discussed by
Brevik.) With this, we find

$$\mathbf{f} + \sum_{ik}\mathbf{g}_i\,\frac{\partial T^{ik}}{\partial x^k} = 0 .$$

Consequently, the force density vector $\mathbf{f}$ is related to the divergence of the tensor $T$.
Since we work with position-independent basic vectors $\mathbf{g}_i$ and may employ
Gauss's theorem ($dV = df_k\,dx^k$ according to p. 38), we also find

$$\mathbf{F} + \sum_{ik}\mathbf{g}_i\oint_{(V)} df_k\,T^{ik} = 0 .$$

We can thus form Maxwell's stress tensor $T^{ik}$ from the field strength, which expresses
the force on a volume due to surface forces. Its diagonal elements supply the compressive
or tensile stress on the surface pair with equal index, and its off-diagonal
elements supply the shear stress on the remaining surface pairs.
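
To see the surface-force picture at work, one can integrate the stress tensor over a sphere around a point charge placed in a homogeneous field and compare with the expected force $q\,\mathbf{E}_0$. The sketch below assumes vacuum, Cartesian coordinates, and the sign convention $T^{ik} = w\,\delta^{ik} - \varepsilon_0 E^i E^k$ with $\mathbf{F} = -\oint df_k\,T^{ik}$; the function names, the sphere radius, and the simple midpoint quadrature are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def total_field(q, e0, point):
    """Field of a point charge q at the origin plus a homogeneous field e0."""
    x, y, z = point
    r3 = (x * x + y * y + z * z) ** 1.5
    k = q / (4.0 * math.pi * EPS0 * r3)
    return (e0[0] + k * x, e0[1] + k * y, e0[2] + k * z)

def force_from_stress(q, e0, radius=1.0, n_u=200, n_phi=200):
    """F_i = -(closed surface integral of T^{ik} n_k), T^{ik} = w delta^{ik} - eps0 E^i E^k."""
    force = [0.0, 0.0, 0.0]
    du = 2.0 / n_u
    dphi = 2.0 * math.pi / n_phi
    for a in range(n_u):
        u = -1.0 + (a + 0.5) * du              # u = cos(theta), midpoint rule
        s = math.sqrt(1.0 - u * u)
        for b in range(n_phi):
            phi = (b + 0.5) * dphi
            n = (s * math.cos(phi), s * math.sin(phi), u)   # outward unit normal
            e = total_field(q, e0, tuple(radius * c for c in n))
            e_dot_n = sum(ei * ni for ei, ni in zip(e, n))
            w = 0.5 * EPS0 * sum(ei * ei for ei in e)       # energy density
            area = radius * radius * du * dphi              # surface element
            for i in range(3):
                force[i] -= (w * n[i] - EPS0 * e[i] * e_dot_n) * area
    return force

if __name__ == "__main__":
    print(force_from_stress(1e-9, (0.0, 0.0, 1000.0)))  # compare with q*E0
```

The charge's own field and the homogeneous field alone each give no net force; only their cross terms survive the integration.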




3.1.10 Summary: Electrostatics 


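In electrostatics, we investigate the effects of charges $Q$ and charge densities $\rho$ at rest.
All phenomena can be derived from Coulomb's law. It supplies the force between
two point charges $q$ and $q'$ in vacuum:

$$\mathbf{F} = \frac{q\,q'}{4\pi\varepsilon_0}\,\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3} .$$

From this action-at-a-distance law, we derived a field theory. We conceived of a test
charge $q$ and introduced a field strength $\mathbf{E}$:

$$\mathbf{F} = q\,\mathbf{E}(\mathbf{r}) , \quad\text{with}\quad \nabla\times\mathbf{E} = 0 \quad\text{and}\quad \nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0} .$$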


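However, the last equation is true only with the microscopic measurable charge
density $\rho$, not with the macroscopic charge density $\bar\rho$, which accounts for freely
moving charges.

We get round this difficulty by introducing dipole moments $\mathbf{p}$ and their density.
This leads to the macroscopic concept of polarization $\mathbf{P}$. Its action on a test charge
can be described by a charge density $-\nabla\cdot\mathbf{P}$. With the electric displacement field

$$\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P} ,$$

we thereby obtain the source equation

$$\nabla\cdot\mathbf{D} = \bar\rho .$$

If the connection between the field strength $\mathbf{E}$ and displacement field $\mathbf{D}$ is known,
the field can be determined. Maxwell's equations of electrostatics read

$$\nabla\times\mathbf{E} = 0 , \quad \nabla\cdot\mathbf{D} = \bar\rho , \quad\text{and}\quad \mathbf{D} = \varepsilon\,\mathbf{E} .$$

It is common to denote the measurable charge density by $\rho$. We will therefore omit
the bar in the following. The first row of these equations yields

$$\oint_{(A)} d\mathbf{r}\cdot\mathbf{E} = 0 \quad\text{and}\quad \oint_{(V)} d\mathbf{f}\cdot\mathbf{D} = Q ,$$

and also

$$\mathbf{n}\times(\mathbf{E}_+ - \mathbf{E}_-) = 0 \quad\text{and}\quad \mathbf{n}\cdot(\mathbf{D}_+ - \mathbf{D}_-) = \rho_A .$$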


Since $\mathbf{E}$ is irrotational, this vector field can be attributed to a scalar potential $\Phi$,
with which calculations are greatly simplified:

$$\mathbf{E} = -\nabla\Phi , \quad \nabla\cdot(\varepsilon\,\nabla\Phi) = -\rho .$$

Then for constant and scalar permittivity, the Poisson equation follows:

$$\Delta\Phi = -\frac{\rho}{\varepsilon} .$$

Here, a boundary condition is appended, namely that the potential should vanish at
infinity, and that there should be no charge there. Instead of that, conditions may
be introduced at the surface of the considered volumes.

In electrostatics, there are no fields in homogeneous conductors. Only at their
interface with insulators are charges possible, and this supplies the boundary conditions:

$$\rho_A = \mathbf{n}\cdot\mathbf{D}_I = -\mathbf{n}\cdot\varepsilon\,\nabla\Phi , \quad 0 = \mathbf{n}\times\mathbf{E}_I = -\mathbf{n}\times\nabla\Phi .$$


Here the index I refers to the adjacent insulator. 


3.2 Stationary Currents and Magnetostatics 


3.2.1 Electric Current 


So far we have restricted ourselves to charges and dipole moments at rest. We shall
now discard this restriction. We let the charges move and use the concept of current
density:

$$\mathbf{j} = \rho\,\mathbf{v} .$$

We call the current flux through the cross-section $A$ of a conductor the current
strength:

$$I = \int_A d\mathbf{f}\cdot\mathbf{j} .$$

For a cross-section that is small compared to the other dimensions of the conductor,
we often replace

$$\mathbf{j}\,dV \;\to\; I\,d\mathbf{r} ,$$

where $d\mathbf{r}$ is in the direction of $\mathbf{j}$, since $dV\,\mathbf{j} \to d\mathbf{r}\cdot d\mathbf{f}\;\mathbf{j} = d\mathbf{r}\;d\mathbf{f}\cdot\mathbf{j}$.

There is a conservation law for electric charges: if the charge $Q$ in a
time-independent volume $V$ changes, then it must flow through the surface of $V$. (We
can also state that the volume associated with $Q$ has changed, but this serves
no purpose here.) Therefore, $dQ/dt = \int_V dV\,\partial\rho/\partial t = -\oint_{(V)} d\mathbf{f}\cdot\rho\,\mathbf{v}$. Note that the
vector normal to the cross-section points to the outside. If this is true for $\rho\,\mathbf{v}$,
then positive charge flows out, hence the minus sign. With Gauss's theorem, viz.,
$\oint_{(V)} d\mathbf{f}\cdot\mathbf{j} = \int_V dV\,\nabla\cdot\mathbf{j}$, we have the continuity equation

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 ,$$

and for surface charges $\rho_A$,

$$\frac{\partial\rho_A}{\partial t} + \mathbf{n}\cdot(\mathbf{j}_+ - \mathbf{j}_-) = 0 ,$$

where $\mathbf{n}$ is again the unit vector perpendicular to the element of the cross-section,
from front to back (from $\mathbf{j}_-$ to $\mathbf{j}_+$). The continuity equation thus follows from charge
conservation, and conversely, charge conservation from the continuity equation.

In this section, we shall deal with stationary currents. Then the charge density
does not change anywhere as time goes by ($\partial\rho/\partial t = 0$), and the current density is
solenoidal ($\nabla\cdot\mathbf{j} = 0$ and $\mathbf{n}\cdot\mathbf{j} = 0$ at conductor surfaces). Only in the next section
will we relax this restriction.


3.2.2 Ohm’s Law 


Electric currents are generated in conductors by electric fields. The fields exert a
force on the charged particles and accelerate them. If we apply a voltage $U\,(> 0)$ at
the ends of a conductor, then a current of strength $I\,(> 0)$ will flow. The ratio $U/I$
is the resistance $R$ of the conductor:

$$U = R\,I .$$

According to Ohm's law, the resistance depends on the properties of the conductor,
but not on the applied voltage or the current. For a homogeneous conductor of length
$l$ and cross-section $A$, apart from its dimensions, it thus depends on the conductivity
$\sigma$:

$$R = \frac{l}{A\,\sigma} .$$

Since $U = E\,l$ and $I = j\,A$, for a homogeneous conductor, we find the differential
form of Ohm's law, viz.,

$$\mathbf{j} = \sigma\,\mathbf{E} .$$


In fact, the current density often depends linearly on the field strength. (The conductivity
$\sigma$ in some crystals is a tensor, because there are preferential directions, but
we do not wish to deal with that here.) However, there are also counterexamples, as
is to be expected, if we try to explain Ohm's law.

Actually, the field strength should accelerate the charges, since the field supplies
a force, while the current density is proportional only to the velocity of the charged
particles. This apparent contradiction in Ohm's law is resolved as for free fall by
invoking friction (see p. 85, and in particular Fig. 2.11). In a metallic conductor,
the electrons always lose the energy they acquire by collisions with the lattice, and
hence move with a constant drift velocity. The associated power appears as Joule
heat,

$$\mathbf{F}\cdot\mathbf{v} = \int dV\,\rho\,\mathbf{E}\cdot\mathbf{v} = \int dV\,\mathbf{j}\cdot\mathbf{E} = I\int d\mathbf{r}\cdot\mathbf{E} = I\,U ,$$

which heats the conductor. Furthermore, the conductivity often depends on the temperature,
which limits Ohm's law.
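
As a small numerical illustration of $R = l/(A\sigma)$ and of the Joule heat, the sketch below compares the power computed from the fields, $\int dV\,\mathbf{j}\cdot\mathbf{E}$, with $I\,U$ for a homogeneous wire. The function names are illustrative, and the conductivity used in the example ($5.96\times10^{7}$ S/m, a typical room-temperature value for copper) is an assumption not taken from the text.

```python
def wire_resistance(length, area, sigma):
    """R = l / (A sigma) for a homogeneous conductor."""
    return length / (area * sigma)

def joule_power_from_fields(length, area, sigma, voltage):
    """Volume integral of j.E for a homogeneous field E = U / l along the wire."""
    e_field = voltage / length          # U = E l
    j = sigma * e_field                 # differential form of Ohm's law
    return j * e_field * area * length  # integrand is constant over the volume A*l

if __name__ == "__main__":
    sigma_cu = 5.96e7                   # assumed conductivity of copper in S/m
    r = wire_resistance(10.0, 1e-6, sigma_cu)
    u = 1.5
    i = u / r
    print(r, i * u, joule_power_from_fields(10.0, 1e-6, sigma_cu, u))
```

Both ways of computing the power agree, as they must for a homogeneous conductor.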

Ohm's law cannot be applied as such to superconductors, which conduct currents
loss-free, and then only at the surface of the conductor or in special tubes (superconductor
of first or second kind, respectively).

In the given differential form, Ohm's law also holds only for homogeneous conductors
(and insulators, which have $\sigma = 0$). For inhomogeneous conductors, we must
also consider the induced field strength:

$$\mathbf{j} = \sigma\,(\mathbf{E} + \mathbf{E}') = \sigma\,\mathbf{E} + \mathbf{j}' ,$$

where the conductivity $\sigma$ now also depends on position. The term $\mathbf{j}'$ refers to the
additional current density at the sources.

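Electric currents are immersed in magnetic fields, which in turn act on the
currents. We shall consider this below. If we neglect this back-action, then stationary
currents can be calculated easily, because for $\nabla\cdot\mathbf{j} = 0$, $\nabla\times\mathbf{E} = 0$, and $\mathbf{j} = \sigma\mathbf{E} + \mathbf{j}'$,
we have

$$\nabla\cdot\sigma\mathbf{E} = -\nabla\cdot\mathbf{j}' \quad\text{and}\quad \nabla\times\mathbf{E} = 0 ,$$

$$\mathbf{n}\cdot(\sigma_+\mathbf{E}_+ - \sigma_-\mathbf{E}_-) = -\mathbf{n}\cdot(\mathbf{j}'_+ - \mathbf{j}'_-) \quad\text{and}\quad \mathbf{n}\times(\mathbf{E}_+ - \mathbf{E}_-) = 0 ,$$

where the potential may be introduced everywhere instead of the field strength
($\mathbf{E} = -\nabla\Phi$). The current density $\mathbf{j}'$ is thereby to be viewed as given. Hence for stationary
currents, we have the same mathematical problem as for $\mathbf{E}$ or $\Phi$ in electrostatics, but
with the conductivity $\sigma$ instead of the permittivity $\varepsilon$ and with $-\nabla\cdot\mathbf{j}'$ instead of the
charge density $\rho$. If we can determine the capacitance between two electrodes with a
given form, then according to Ohm, the resistance between the same electrodes for
a conductor satisfies

$$R\,C = \frac{\varepsilon}{\sigma} ,$$

because $I = \int_A d\mathbf{f}\cdot\sigma\mathbf{E} = (\sigma/\varepsilon)\int_A d\mathbf{f}\cdot\mathbf{D} = (\sigma/\varepsilon)\,Q$, with $Q = C\,U$ and $U = R\,I$. In particular,
Kirchhoff's laws are obtained. The total resistance $R = U/I$ of the various
individual resistors $R_n$ depends on the type of connection:

$$\frac{1}{R} = \sum_n \frac{1}{R_n} \qquad\text{parallel connection (with } I = \textstyle\sum_n I_n\text{)} ,$$

$$R = \sum_n R_n \qquad\text{series connection (with } U = \textstyle\sum_n U_n\text{)} .$$

This is illustrated in Fig. 3.8.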
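
The two combination rules and the correspondence between capacitance and resistance are easy to try out. The following sketch is illustrative only: the function names are assumptions, and the parallel-plate capacitance $C = \varepsilon A/d$ used in the consistency check is the standard textbook formula, quoted here without derivation.

```python
def series(resistances):
    """Total resistance of resistors in series: R = sum of R_n."""
    return sum(resistances)

def parallel(resistances):
    """Total resistance of resistors in parallel: 1/R = sum of 1/R_n."""
    return 1.0 / sum(1.0 / r for r in resistances)

def resistance_from_capacitance(capacitance, eps, sigma):
    """R = eps / (sigma C) for the same electrode geometry."""
    return eps / (sigma * capacitance)

if __name__ == "__main__":
    print(series([2.0, 2.0]), parallel([2.0, 2.0]))
    eps, sigma, area, d = 8.854e-12, 5.0, 1e-4, 1e-3
    c_plate = eps * area / d            # assumed parallel-plate capacitance
    print(resistance_from_capacitance(c_plate, eps, sigma), d / (sigma * area))
```

For the plate geometry, the relation reproduces $R = l/(A\sigma)$ with $l = d$, which is a useful sanity check of the analogy between $\varepsilon$ and $\sigma$.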


3.2.3 Lorentz Force 


Moving charges (currents) are deflected by magnetic fields. There is a force acting
on a point charge $q$ moving with velocity $\mathbf{v}$ in a magnetic field $\mathbf{B}$, namely, the Lorentz
force

$$\mathbf{F} = q\,\mathbf{v}\times\mathbf{B} \quad\Longrightarrow\quad \mathbf{F} = \int dV\,\mathbf{j}\times\mathbf{B} .$$

Note that, since $\mathbf{F}$ and $\mathbf{v}$ are polar vectors, $\mathbf{B}$ must be an axial vector. This
velocity-dependent force was already mentioned on p. 78 and was generalized to the concept
of potential energy (p. 98) and momentum (p. 100). Here then the acceleration is
perpendicular to the velocity and the kinetic energy is therefore conserved. If we
write $\dot{\mathbf{v}} = \boldsymbol{\omega}\times\mathbf{v}$, we have $\boldsymbol{\omega} = -(q/m)\,\mathbf{B}$ for the cyclotron frequency. For fixed $\mathbf{B}$,
we find a helical orbit with the Darboux vector $\boldsymbol{\omega}/v$, and in particular with $\mathbf{v}\perp\mathbf{B}$, a
circular orbit of radius $R = v/\omega$, because only then do the Lorentz force $m\,\omega\,v$ and
the centrifugal force $m\,v^2/R$ cancel each other.
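
The circular orbit and the conservation of kinetic energy can be made visible by integrating the equation of motion numerically. This is a minimal sketch in arbitrary units, using a classical fourth-order Runge-Kutta step; the function names, the step count, and the choice of integrator are assumptions made for illustration.

```python
import math

def cross(a, b):
    """Cartesian cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def orbit(q, m, b_field, v0, t_end, steps):
    """Integrate dr/dt = v, dv/dt = (q/m) v x B; returns states (x, y, z, vx, vy, vz)."""
    def deriv(state):
        v = state[3:]
        acc = tuple(q / m * c for c in cross(v, b_field))
        return v + acc                      # tuple concatenation: (dr/dt, dv/dt)

    state = (0.0, 0.0, 0.0) + tuple(v0)
    dt = t_end / steps
    path = [state]
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = deriv(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        path.append(state)
    return path

if __name__ == "__main__":
    # q = m = |B| = v = 1 gives omega = 1 and R = v/omega = 1
    path = orbit(1.0, 1.0, (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), 2 * math.pi, 2000)
    print(max(math.sqrt(s[0] ** 2 + s[1] ** 2) for s in path))  # orbit diameter 2R
```

With these units, one period lasts $2\pi/\omega = 2\pi$, and the largest distance from the starting point is the orbit diameter $2R = 2v/\omega$.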

However, there is no force acting on stationary currents in the homogeneous
magnetic field, because for $\nabla\cdot\mathbf{j} = 0$, according to p. 17 or Problem 3.4, we also
have $\int dV\,\mathbf{j} = 0$. Nevertheless we can measure this magnetic field, if we use the
torque on a current loop. This will now be shown for very small conductor loops,
since then a homogeneous magnetic field may be assumed.

For the torque, we require the volume integral of
$\mathbf{r}\times(\mathbf{j}\times\mathbf{B}) = \mathbf{B}\cdot\mathbf{r}\;\mathbf{j} - \mathbf{r}\cdot\mathbf{j}\;\mathbf{B}$.
Here we consider a little box around the conductor loop without current at its surface.
Then, since $2\,\mathbf{r}\cdot\mathbf{j} = \mathbf{j}\cdot\nabla r^2 = \nabla\cdot(r^2\,\mathbf{j}) - r^2\,\nabla\cdot\mathbf{j}$ for a solenoidal current density,
the volume integral of $\mathbf{r}\cdot\mathbf{j}$ vanishes according to Gauss's theorem, and for

$$2\,\mathbf{B}\cdot\mathbf{r}\;\mathbf{j} = \{\mathbf{B}\cdot\mathbf{r}\;\mathbf{j} + \mathbf{B}\cdot\mathbf{j}\;\mathbf{r}\} + (\mathbf{r}\times\mathbf{j})\times\mathbf{B} ,$$

the volume integral of the curly bracket vanishes as well, because we have

$$r_i\,j_k + j_i\,r_k = \mathbf{j}\cdot\nabla(r_i r_k) = \nabla\cdot(r_i r_k\,\mathbf{j}) - r_i r_k\,\nabla\cdot\mathbf{j} ,$$

and therefore the same procedure as above is applicable, with $r_i r_k$ instead of $r^2$.
Therefore, the homogeneous magnetic field $\mathbf{B}$ exerts a torque

$$\mathbf{N} = \tfrac12\int dV\,(\mathbf{r}\times\mathbf{j})\times\mathbf{B}$$

on the conductor loop. Hence the magnetic field can be determined and the concept
of magnetic moment introduced.


3.2.4 Magnetic Moments 


The last equation suggests introducing the magnetic moment of the conductor loop
(or more precisely, its dipole moment)

$$\mathbf{m} = \tfrac12\int dV\,\mathbf{r}\times\mathbf{j} .$$

This is an axial vector. For a current of strength $I$ around a plane sheet $A$, it has
magnitude (see Fig. 2.4)

$$m = \tfrac12\,I\,\Big|\oint \mathbf{r}\times d\mathbf{r}\Big| = I\,A ,$$

and points in the direction of the normal to the loop, in such a way that the current
direction forms a right-hand screw around this axis.
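
For a thin wire bent into a polygon, the line integral for the magnetic moment can be evaluated exactly segment by segment. The sketch below is illustrative; the function names and the polygon representation (a list of corner points traversed in the direction of the current) are assumptions.

```python
def cross(a, b):
    """Cartesian cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def loop_moment(current, vertices):
    """m = (I/2) * closed line integral of r x dr for a polygonal loop."""
    m = [0.0, 0.0, 0.0]
    n = len(vertices)
    for k in range(n):
        r0 = vertices[k]
        r1 = vertices[(k + 1) % n]
        # On a straight segment from r0 to r1, the integral of r x dr equals r0 x r1.
        c = cross(r0, r1)
        for i in range(3):
            m[i] += 0.5 * current * c[i]
    return tuple(m)

if __name__ == "__main__":
    square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]  # counterclockwise in the x,y-plane
    print(loop_moment(2.0, square))   # magnitude I*A, direction +z by the right-hand rule
```

Reversing the order of the corner points reverses the current direction and hence the sign of the moment.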
Such a magnetic moment in a homogeneous magnetic field $\mathbf{B}$ experiences a torque

$$\mathbf{N} = \mathbf{m}\times\mathbf{B} ,$$

as was shown before. (Instead of this, $\mathbf{N} = \mu_0\mathbf{m}\times\mathbf{B}/\mu_0$ is often used and $\mu_0\mathbf{m}$ is
called the magnetic moment, whence for $\mathbf{m}$, the factor $\mu_0$ is included. This idea
goes contrary to the IUPAP recommendation.)

If the current originates from a charge $Q$ of mass $M$ (both evenly distributed,
so that $\rho/Q = \rho_M/M$) distributed along the closed orbit, the magnetic moment is
related to the orbital angular momentum $\mathbf{L}$ by
$\mathbf{m} = \tfrac12\int dV\,\mathbf{r}\times\rho\,\mathbf{v} = \tfrac12\,(Q/M)\,\mathbf{L}$.
In atomic physics, the action quantum $\hbar$ is taken as a unit for the orbital angular
momentum, along with the charge and mass of an electron ($Q = -e$ and $M = m_e$).
Therefore, in that context, the magnetic moment is related to the Bohr magneton (see
p. 623)

$$\mu_B = \frac{e\,\hbar}{2\,m_e} .$$

In atomic physics, magnetic moments are usually denoted by $\mu$, but in macroscopic
electromagnetism, this is already reserved for the permeability.

While electric dipole moments can be formed from monopoles, magnetic mono- 
poles have not yet been observed. Such a thing would have to be a pseudo-scalar, 
because m is an axial vector. Since Dirac, it has not been excluded that there may be 
magnetic monopoles in elementary particle physics—it may just be that they have 
not yet been separated. In any case, all our macroscopic considerations work without 
magnetic monopoles. 


3.2.5 Magnetization 


As Fig. 3.9 shows, we can replace macroscopic current loops by many microscopic
ones, and these then by magnetic moments, if we deal with the action of a magnetic
field. It is therefore useful to introduce the density of magnetic moments on the
surface or in the volume. If there are $N$ magnetic moments $\mathbf{m}_i$ in the volume $\Delta V$,
then

$$\mathbf{M} = \frac{1}{\Delta V}\sum_{i=1}^{N}\mathbf{m}_i$$

is the associated magnetization, an axial vector like $\mathbf{m}$ or $\mathbf{r}\times\mathbf{v}$. (Here, many
solid-state physicists include the factor $\mu_0$ for the magnetic moment. This goes against the
IUPAP recommendation.) It is sometimes also called the (magnetic) polarization.
As shown in the right-hand part of Fig. 3.9, $\mathbf{M}$ has curls, where the current density
does not vanish. In fact, we find $\nabla\times\mathbf{M} = \mathbf{j}$, because if $d$ is the distance between
different current-loop planes, then in addition to $m = I\,A$, we also have
$m = M\,d\,A$. The magnetization clearly has a discontinuity of $I/d$, i.e., by the surface current
density $j_A$, at the current-carrying surface, whence we arrive at $\nabla\times\mathbf{M} = \mathbf{j}$.


Fig. 3.9 Left: A macroscopic current loop (in the y, z-plane) is divided into 4×5 "microscopic"
ones, each of which represents a magnetic moment. Right: A cuboid with three such planes with
magnetic moments (arrows) is cut open (in the x, z-plane) and a current loop is associated with
each one. The intersection points of the currents are indicated by the black dots


In atoms there are magnetic moments but no electric conduction currents, as can
be verified in a magnetic field. Therefore, for the behavior in magnetic fields, we
have to account for the magnetization in addition to macroscopic electric currents,
and in microscopic electromagnetism, we have to introduce a "microscopic current
density"

$$\mathbf{j} = \bar{\mathbf{j}} + \nabla\times\mathbf{M} .$$

(For the magnetic moments of elementary particles this is not justified, however,
because their moment is connected to the spin and cannot be derived from a molecular
current.) Since $\mathbf{j}$ differs from $\bar{\mathbf{j}}$ only by a rotational field, $\mathbf{j}$ is solenoidal like $\bar{\mathbf{j}}$.

Later we shall stick to macroscopic electromagnetism and always take only the
macroscopic current density $\bar{\mathbf{j}}$ (leaving out the bar), but for the time being, $\mathbf{j}$ will be
the microscopic current density.


3.2.6 Magnetic Fields 


Even if Sect. 3.2.3 has already shown that a field B can be measured (by forces acting 
on magnetic moments or moving charges), we still have to deal with its generation: 
magnetic fields occur for magnetic moments as well as for electric currents. 

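Since there are no magnetic monopoles, the magnetic field $\mathbf{B}$ is solenoidal. In
addition, we find from experiment that each microscopic current density is related
to the circulation density of a magnetic field:

$$\nabla\cdot\mathbf{B} = 0 \quad\text{and}\quad \nabla\times\mathbf{B} = \mu_0\,\mathbf{j} = \mu_0\,(\bar{\mathbf{j}} + \nabla\times\mathbf{M}) ,$$

since we have the Biot-Savart law

$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\,\nabla\times\int dV'\,\frac{\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} = -\frac{\mu_0}{4\pi}\int dV'\,\mathbf{j}(\mathbf{r}')\times\nabla\frac{1}{|\mathbf{r}-\mathbf{r}'|} .$$

Here $\nabla$ acts only on $\mathbf{r}$ and not on $\mathbf{r}'$, and hence we have $\nabla\times G\,\mathbf{j}' = -\mathbf{j}'\times\nabla G$. For
sufficiently thin conductors, it follows that

$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0 I'}{4\pi}\,\nabla\times\oint\frac{d\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|} = -\frac{\mu_0 I'}{4\pi}\oint d\mathbf{r}'\times\nabla\frac{1}{|\mathbf{r}-\mathbf{r}'|} .$$

For a given magnetization, $\nabla\times\mathbf{M}$ may of course appear instead of $\mathbf{j}$.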
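
The line-integral form of the Biot-Savart law lends itself to a direct numerical evaluation. The following sketch sums the contributions $d\mathbf{r}'\times(\mathbf{r}-\mathbf{r}')/|\mathbf{r}-\mathbf{r}'|^3$ (which equals $-d\mathbf{r}'\times\nabla|\mathbf{r}-\mathbf{r}'|^{-1}$) around a circular loop in the x,y-plane. The function names, the discretization, and the value of $\mu_0$ are assumptions; the on-axis comparison value $\mu_0 I R^2/\bigl(2(R^2+z^2)^{3/2}\bigr)$ is the standard result for a circular loop and is not derived in the text.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T m/A (conventional value, assumed here)

def cross(a, b):
    """Cartesian cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def loop_field(current, radius, point, n=720):
    """B at `point` from a circular loop around the z axis, summed over n line elements."""
    b = [0.0, 0.0, 0.0]
    dphi = 2.0 * math.pi / n
    for k in range(n):
        phi = (k + 0.5) * dphi
        src = (radius * math.cos(phi), radius * math.sin(phi), 0.0)          # r'
        dl = (-radius * math.sin(phi) * dphi, radius * math.cos(phi) * dphi, 0.0)  # dr'
        d = tuple(p - s for p, s in zip(point, src))                         # r - r'
        dist3 = sum(c * c for c in d) ** 1.5
        c = cross(dl, d)
        for i in range(3):
            b[i] += MU0 * current / (4.0 * math.pi) * c[i] / dist3
    return tuple(b)

if __name__ == "__main__":
    print(loop_field(3.0, 1.0, (0.0, 0.0, 0.5)))
    print(MU0 * 3.0 * 1.0 / (2.0 * (1.0 + 0.25) ** 1.5))  # on-axis reference value
```

Off the axis no simple closed form exists, but the same function can be evaluated at any point that does not lie on the wire.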
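Since we have $\nabla\times\mathbf{B} = \mu_0\,\mathbf{j} = \mu_0\,(\bar{\mathbf{j}} + \nabla\times\mathbf{M})$ in the macroscopic theory, we
set

$$\mathbf{B} = \mu_0\,(\mathbf{H} + \mathbf{M}) , \quad \nabla\times\mathbf{H} = \bar{\mathbf{j}} ,$$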


where H is called the excitation or magnetic field strength and B is referred to as 
the displacement field of the magnetic field or magnetic induction. Since B is a 
measure of the force on moving charges, it should actually be called the magnetic 
field strength, but if we compare electrostatics and magnetostatics, the choice of 
names is understandable, as we shall now show. 

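In magnetostatics, we deal with magnetized matter without electric currents,
whence $\bar{\mathbf{j}} = 0$. Because $\nabla\cdot\mathbf{B} = 0$ and $\mathbf{B} = \mu_0\,(\mathbf{H} + \mathbf{M})$, we clearly then have

$$\nabla\times\mathbf{H} = 0 \quad\text{and}\quad \nabla\cdot\mathbf{H} = -\nabla\cdot\mathbf{M} .$$

This is reminiscent of $\nabla\times\mathbf{E} = 0$ and $\nabla\cdot\mathbf{E} = -\varepsilon_0^{-1}\,\nabla\cdot\mathbf{P}$ for uncharged polarized
matter in electrostatics. Since the excitation $\mathbf{H}$ is irrotational here, we may likewise
introduce a scalar magnetic potential $\Phi_m$ by

$$\mathbf{H} = -\nabla\Phi_m ,$$

where (see p. 174)

$$\Phi_m(\mathbf{r}) = -\frac{1}{4\pi}\int dV'\,\frac{\nabla'\cdot\mathbf{M}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} = -\frac{1}{4\pi}\,\nabla\cdot\int dV'\,\frac{\mathbf{M}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} .$$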


The magnetic potential cannot be connected to a potential energy though (there are 
no magnetic monopoles), and it is a pseudo-scalar. 

A tiny rod magnet with moment m’ at the position r’ thus produces the magnetic 
field (see p. 172 and Problem 3.8) 


Fig. 3.10 Field lines of a permanent homogeneous magnetized cylinder. Left: H field. Right: B
field. Except for the edges, the flux through the surface increases by one unit from line to line. The
right-hand figure applies also to the H and B field of a current-carrying coil

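$$\mathbf{H}(\mathbf{r}) = -\nabla\Phi_m = \frac{1}{4\pi}\,\nabla\left(\nabla\cdot\frac{\mathbf{m}'}{|\mathbf{r}-\mathbf{r}'|}\right) = \frac{1}{4\pi}\,\frac{3\,\mathbf{m}'\cdot\mathbf{e}\;\mathbf{e} - \mathbf{m}'}{|\mathbf{r}-\mathbf{r}'|^3} - \frac{\mathbf{m}'}{3}\,\delta(\mathbf{r}-\mathbf{r}') , \quad\text{with } \mathbf{e} = \frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|} .$$

This is related to the magnetic induction field

$$\mathbf{B}(\mathbf{r}) = \mu_0\,\mathbf{H} + \mu_0\,\mathbf{m}'\,\delta(\mathbf{r}-\mathbf{r}') .$$

Incidentally, this can also be written as $\mu_0/(4\pi)\;\nabla\times(\nabla\times\mathbf{m}'/|\mathbf{r}-\mathbf{r}'|)$, because
$\nabla\times(\nabla\times\mathbf{a}) = \nabla(\nabla\cdot\mathbf{a}) - \Delta\mathbf{a}$, and according to p. 26,
$\Delta|\mathbf{r}-\mathbf{r}'|^{-1} = -4\pi\,\delta(\mathbf{r}-\mathbf{r}')$.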

For a homogeneous magnetized cylinder (in air) with the curved surface along $\mathbf{M}$,
the magnetization has sources only on the faces and curls only on the curved surface.
They jump there from $\mathbf{M}$ to zero. Therefore, on the faces $\nabla_A\cdot\mathbf{M} = -\mathbf{n}\cdot\mathbf{M}$, and
on the curved surface $\nabla_A\times\mathbf{M} = -\mathbf{n}\times\mathbf{M}$. The potential $\Phi_m$ of a circular face
can be expressed with the help of a complete elliptic integral of the first kind (see
Problem 3.17), and that of a homogeneous circular disk as an integral of it. If we
have calculated $\Phi_m$ on the edge and on the faces, then the remaining values follow
faster numerically via the Laplace equation (see [3]). Outside the cylinder, the two
fields $\mu_0\mathbf{H}$ and $\mathbf{B}$ are equal, because $\mathbf{M} = 0$, while on the axis inside they are directed
oppositely (see Fig. 3.10).




3.2.7 Basic Equations of Macroscopic Magnetostatics 
with Stationary Currents 


Once again we allow for electric currents $\mathbf{j}$ and consider the basic equations of
macroscopic magnetostatics with stationary currents derived at the beginning of the
last section:

$$\nabla\cdot\mathbf{B} = 0 , \quad \nabla\times\mathbf{H} = \mathbf{j} , \quad\text{and}\quad \mathbf{B} = \mu\,\mathbf{H} .$$

These differential equations supply the boundary conditions

$$\mathbf{n}\cdot(\mathbf{B}_+ - \mathbf{B}_-) = 0 \quad\text{and}\quad \mathbf{n}\times(\mathbf{H}_+ - \mathbf{H}_-) = \mathbf{j}_A ,$$

and read in integral form

$$\oint_{(V)} d\mathbf{f}\cdot\mathbf{B} = 0 \quad\text{and}\quad \oint_{(A)} d\mathbf{r}\cdot\mathbf{H} = I .$$

Here $\mathbf{j}_A$ denotes the macroscopic current density at the surface. It normally vanishes.
It occurs only for superconductors of the first kind: then there is no magnetic field in
the interior (Meissner-Ochsenfeld effect), and only the surface carries a current. The
last equation is called Ampère's circuital law, and in earlier years also the Ørsted
law. It relates the magnetic field and current strength in a particularly simple way
and also contains the right-hand rule: the magnetic field encircles the current $\mathbf{I} = A\,\mathbf{j}$
anticlockwise.

For example, for a straight normal conductor wire of circular cross-section with
radius $R_0$ and constant current density $j = I/(\pi R_0^{\,2})$, the magnetic field in cylindrical
coordinates $R, \varphi, z$ about the wire axis is given by

$$\mathbf{H} = \frac{1}{2\pi}\begin{cases} \dfrac{\mathbf{I}\times\mathbf{R}}{R_0^{\,2}} & \text{for } R \le R_0 , \\[2ex] \dfrac{\mathbf{I}\times\mathbf{R}}{R^2} & \text{for } R \ge R_0 . \end{cases}$$

The right-hand rule requires $\mathbf{H}$ to be proportional to $\mathbf{I}\times\mathbf{R}$ (up to a positive factor),
and Ampère's law fixes the absolute value. We have $2\pi R\,H(R)$ equal to $I\,(R/R_0)^2$
for $R \le R_0$ and equal to $I$ for $R \ge R_0$. Of course, there is no arbitrarily long straight
wire. Therefore the realistic magnetic fields of stationary currents also decay at large
distances more quickly than $R^{-1}$, in fact, like a dipole field as $(R^2 + z^2)^{-3/2}$.
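
The piecewise field of the straight wire can be written as a short function of the distance from the axis. This is only a sketch with illustrative names; it returns the magnitude $H(R)$ and lets one confirm Ampère's law and the continuity at the wire surface.

```python
import math

def wire_h(current, r0, big_r):
    """Magnitude H(R) for a straight wire of radius r0 with constant current density."""
    if big_r <= r0:
        return current * big_r / (2.0 * math.pi * r0 ** 2)   # inside: grows linearly
    return current / (2.0 * math.pi * big_r)                 # outside: falls off like 1/R

if __name__ == "__main__":
    i, r0 = 5.0, 1e-3
    for big_r in (0.5e-3, 1e-3, 2e-3):
        # circulation 2 pi R H(R) equals the enclosed current
        print(big_r, 2.0 * math.pi * big_r * wire_h(i, r0, big_r))
```

Inside the wire the circulation grows with the enclosed cross-section, $I\,(R/R_0)^2$, and outside it stays at the full current $I$.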

In general, an applied magnetic field magnetizes a (magnetic) medium because it
polarizes irregularly oriented moments. We therefore write

$$\mathbf{M} = \chi_m\,\mathbf{H} \quad\text{and}\quad \mathbf{B} = \mu_0\,(1 + \chi_m)\,\mathbf{H} = \mu\,\mathbf{H} ,$$


Fig. 3.11 Magnetic spheres. Upper: Paramagnetic. Lower: Diamagnetic. The sphere is brought
into a homogeneous magnetic field. Left: H field. Right: B field. Both are axially symmetric, and
we find ∇·B = 0 and B = μH in addition to ∇×H = 0. The lower figure is always useful if the
permeability inside is lower than outside, even if there is no diamagnet (in air). The figures are also
valid for electric field lines for different permittivities and for current lines of stationary currents for
different conductivities, as explained in the text: H → E and either μ → ε and B → D or μ → σ
and B → j


with permeability $\mu$ (sometimes expressed as relative permeability $\mu_r = \mu/\mu_0$) and
magnetic susceptibility $\chi_m = \mu_r - 1$. These are tensors, if there are preferential directions:
$\mathbf{B}$ and $\mathbf{H}$ may have different directions. For a ferromagnet, $\mathbf{B}$ and $\mathbf{H}$ are not
related to each other linearly. This is often represented in a hysteresis curve $M(H)$. (In
a weak field typical values for these are $\mu_r \approx 500$.) For materials with smaller scalar
permeability, we also distinguish between paramagnets with $\chi_m > 0$ or $\mu_r > 1$ and
diamagnets with $\chi_m < 0$ or $0 < \mu_r < 1$. The dielectric susceptibility $\chi_e$ is always
positive: paramagnetism can be explained as orientation of dipole moments, diamagnetism
as a consequence of Lenz's law, which will be dealt with only in the next
section.

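If we consider a magnetic sphere (radius $r_0$) in a homogeneous magnetic field
$\mathbf{H}_0$ (at long range), in addition to $\nabla\times\mathbf{H} = 0$ and because $\nabla\cdot\mathbf{B} = 0 = \nabla\cdot\mu\mathbf{H}$, we
have $\mu_i\,\mathbf{n}\cdot\mathbf{H}_i = \mu_a\,\mathbf{n}\cdot\mathbf{H}_a$. The magnetic field is irrotational and only has sources
on the surface. The associated discontinuity is related to the field of a dipole
$\mathbf{m} = (\mu_i - \mu_a)/(\mu_i + 2\mu_a)\;r_0^{\,3}\,\mathbf{H}_0$ (except for a factor of $4\pi$), because with

$$\mathbf{H}_i = \mathbf{H}_0 - \frac{\mathbf{m}}{r_0^{\,3}} \quad\text{and}\quad \mathbf{H}_a = \mathbf{H}_0 + \frac{3\,\mathbf{m}\cdot\mathbf{e}\;\mathbf{e} - \mathbf{m}}{r^3} \quad\text{for } \mathbf{e} = \frac{\mathbf{r}}{r} ,$$

all the above-mentioned conditions are satisfied. This result is illustrated in Fig. 3.11,
where the pictures are also valid for electric field lines for different permittivities
(because $\nabla\cdot\mathbf{D} = 0$, $\mathbf{D} = \varepsilon\mathbf{E}$, $\nabla\times\mathbf{E} = 0$) and for current lines of stationary currents
for different conductivities (because $\nabla\cdot\mathbf{j} = 0$, $\mathbf{j} = \sigma\mathbf{E}$, $\nabla\times\mathbf{E} = 0$). To this end,
we replace $\mathbf{H}\to\mathbf{E}$ and either $\mu\to\varepsilon$ and $\mathbf{B}\to\mathbf{D}$ or $\mu\to\sigma$ and $\mathbf{B}\to\mathbf{j}$ (see also
Problem 3.23).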
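
That the two fields really satisfy the boundary conditions on the sphere can be checked at an arbitrary surface point. The sketch below evaluates $\mathbf{H}_i$ and $\mathbf{H}_a$ at $r = r_0$ and compares the normal components of $\mu\mathbf{H}$ and the tangential components of $\mathbf{H}$; function names and the sample values are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sphere_surface_fields(mu_i, mu_a, r0, h0, e):
    """Return (H_i, H_a) at the surface point r0*e, where e is a unit vector."""
    kappa = (mu_i - mu_a) / (mu_i + 2.0 * mu_a)
    m = tuple(kappa * r0 ** 3 * c for c in h0)            # dipole moment (without 4 pi)
    h_in = tuple(h - mc / r0 ** 3 for h, mc in zip(h0, m))
    m_dot_e = dot(m, e)
    h_out = tuple(h + (3.0 * m_dot_e * ec - mc) / r0 ** 3
                  for h, mc, ec in zip(h0, m, e))
    return h_in, h_out

def tangential(v, e):
    """Part of v perpendicular to the unit normal e."""
    vn = dot(v, e)
    return tuple(vc - vn * ec for vc, ec in zip(v, e))

if __name__ == "__main__":
    e = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)                 # unit normal at the surface point
    h_in, h_out = sphere_surface_fields(5.0, 1.0, 0.2, (0.0, 0.0, 1.0), e)
    print(5.0 * dot(h_in, e), 1.0 * dot(h_out, e))        # normal B components
    print(tangential(h_in, e), tangential(h_out, e))      # tangential H components
```

The same check applies unchanged to a dielectric sphere or to a sphere of different conductivity after the replacements listed above.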




3.2.8 Vector Potential 


The displacement field $\mathbf{B}$ is always solenoidal and therefore a rotational field:

$$\mathbf{B} = \nabla\times\mathbf{A} .$$

$\mathbf{A}$ is called the vector potential and is a polar vector field because the induction is
an axial field. Here the induction field $\mathbf{B}$ can be measured via the Lorentz force or
by its action on magnetic moments, while the vector potential $\mathbf{A}$ represents only
a computational tool and is not unique: only its curl is physically fixed, not its
sources (and an additive constant). Therefore, a gradient field may also be added:
$\mathbf{A}' = \mathbf{A} - \nabla\psi$ would supply the same magnetic field as $\mathbf{A}$. The vector potential must
therefore be gauged, and in this case $\nabla\cdot\mathbf{A}$ is fixed along with a constant additive
term (in most cases we require it to vanish for $r\to\infty$). For the Coulomb gauge, the
vector potential is chosen solenoidal.

The equation $\nabla\times\mathbf{B} = \mu_0\,\mathbf{j}$ does not depend on the gauge, but

$$\Delta\mathbf{A} = -\mu_0\,\mathbf{j}$$

does, since we only have $\Delta\mathbf{A} = -\nabla\times\mathbf{B}$ for a solenoidal vector potential given that
$\Delta\mathbf{A} = \nabla(\nabla\cdot\mathbf{A}) - \nabla\times(\nabla\times\mathbf{A})$. On the other hand, $\Delta\mathbf{A} = -\mu_0\,\mathbf{j}$ (if $\mathbf{A}\to 0$ for
$r\to\infty$ holds) has, according to p. 27, the solution

$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int dV'\,\frac{\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} ,$$


which yields the Biot-Savart law (see p. 193). For stationary currents, this vector
potential is solenoidal, because for its source density we require $\mathbf{j}'\cdot\nabla G$, which can be
rephrased as $G\,\nabla'\cdot\mathbf{j}' - \nabla'\cdot G\,\mathbf{j}'$ since $\nabla G = -\nabla' G$. According to Gauss's theorem,
we then only require a surface where there is no current to prove the statement.
Here $\mathbf{j}$ is still the microscopic current density, and can also appear as the circulation
density of a magnetization. In this case, we have
$G\,\mathbf{j}' = G\,\nabla'\times\mathbf{M}' = \nabla'\times G\,\mathbf{M}' + \mathbf{M}'\times\nabla' G$.
Therefore, the vector potential of a magnetic moment $\mathbf{m}'$ results in

$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\,\mathbf{m}'\times\nabla'\frac{1}{|\mathbf{r}-\mathbf{r}'|} = \frac{\mu_0}{4\pi}\,\nabla\times\frac{\mathbf{m}'}{|\mathbf{r}-\mathbf{r}'|} ,$$

because the surface integral of $G\,\mathbf{M}'$ does not contribute.

For a homogeneous displacement field $\mathbf{B}$, we may set $\mathbf{A}(\mathbf{r}) = \tfrac12\,\mathbf{B}\times\mathbf{r}$, because
then $\nabla\times\mathbf{A} = \mathbf{B}$, and the Coulomb gauge holds everywhere. Then the origin of $\mathbf{r}$
may be chosen arbitrarily, since a constant is irrelevant. For other fields it is fixed by the
condition $\mathbf{A} = 0$ for $r\to\infty$, which is not suitable for a homogeneous field.
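
The claim that $\mathbf{A} = \tfrac12\,\mathbf{B}\times\mathbf{r}$ reproduces a homogeneous field in the Coulomb gauge can be verified with finite differences. The sketch below is an illustration under stated assumptions: the helper names and the step width `h` are arbitrary, and central differences are used because they are exact (up to rounding) for a field that is linear in $\mathbf{r}$.

```python
def cross(a, b):
    """Cartesian cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def vector_potential(b, r):
    """A(r) = (1/2) B x r for a homogeneous field B."""
    return tuple(0.5 * c for c in cross(b, r))

def jacobian(field, r, h=1e-5):
    """J[i][k] = d field_i / d x_k by central differences."""
    jac = [[0.0] * 3 for _ in range(3)]
    for k in range(3):
        plus = list(r)
        minus = list(r)
        plus[k] += h
        minus[k] -= h
        f_plus, f_minus = field(tuple(plus)), field(tuple(minus))
        for i in range(3):
            jac[i][k] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return jac

def curl(field, r):
    j = jacobian(field, r)
    return (j[2][1] - j[1][2], j[0][2] - j[2][0], j[1][0] - j[0][1])

def divergence(field, r):
    j = jacobian(field, r)
    return j[0][0] + j[1][1] + j[2][2]

if __name__ == "__main__":
    b = (0.2, -1.0, 0.7)
    a_field = lambda r: vector_potential(b, r)
    print(curl(a_field, (1.0, 2.0, 3.0)))        # reproduces B
    print(divergence(a_field, (1.0, 2.0, 3.0)))  # vanishes in the Coulomb gauge
```

Shifting the origin of $\mathbf{r}$ only adds a constant vector to $\mathbf{A}$, which changes neither the curl nor the divergence.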

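The integral mentioned at the beginning does not need to be taken over the whole
space, if we also take into account surface integrals (as for the scalar potential on
p. 169). We use the equation $\Delta\mathbf{A} = -\mu_0\,\mathbf{j}$ and Green's second theorem, i.e., in
$\int_V dV\,(\psi\,\Delta\phi - \phi\,\Delta\psi) = \oint_{(V)} d\mathbf{f}\cdot(\psi\,\nabla\phi - \phi\,\nabla\psi)$, we replace the function $\psi$ by
$|\mathbf{r}-\mathbf{r}'|^{-1}$ and the function $\phi$ by the three components of the vector potential. It then
follows that

$$4\pi\,\mathbf{A}(\mathbf{r}) = \int_{V'} dV'\,\frac{\mu_0\,\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} + \oint_{(V')}\left(\frac{d\mathbf{f}'\cdot\nabla'\,\mathbf{A}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} - \Big(d\mathbf{f}'\cdot\nabla'\frac{1}{|\mathbf{r}-\mathbf{r}'|}\Big)\,\mathbf{A}(\mathbf{r}')\right) .$$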

In particular, we may choose $V'$ such that it contains no current: then the vector potential
is fixed by its values and its first derivatives on the surface $(V')$. As in electrostatics
(see the end of Sect. 3.1.3), the same physical field in a finite region can also be
generated in various ways in magnetostatics (by distributions in space or on
sheets). The continuation across the boundaries is not unique and allows various
models. This has also been clearly demonstrated in the context of Fig. 3.9.


3.2.9 Magnetic Interaction 


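An inhomogeneous magnetic field exerts a force on a magnetic moment. If we use
the equation $\oint_{(A)} d\mathbf{r}\times\mathbf{B} = \int_A (d\mathbf{f}\times\nabla)\times\mathbf{B}$ of p. 17 and $\mathbf{m} = I\,\mathbf{A}$ of p. 190, then for
a sufficiently small conductor loop, it follows that

$$\mathbf{F} = I\oint_{(A)} d\mathbf{r}\times\mathbf{B} = (\mathbf{m}\times\nabla)\times\mathbf{B}.$$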


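As for the electric dipole moment, we require likewise small extensions for the
magnetic moment in order for the higher moments to become negligible. Here we
may also write $\nabla(\mathbf{m}\cdot\mathbf{B}) - \mathbf{m}\,\nabla\cdot\mathbf{B}$, since the differential operator changes only $\mathbf{B}$.
Given that $\mathbf{B}$ is always solenoidal, we find

$$\mathbf{F} = \nabla(\mathbf{m}\cdot\mathbf{B}).$$

Therefore, we may also introduce a potential energy

$$E_{\rm pot} = -\mathbf{m}\cdot\mathbf{B},$$

and again, $\mathbf{F} = -\nabla E_{\rm pot}$ holds. This corresponds to the expression $E_{\rm pot} = -\mathbf{p}\cdot\mathbf{E}$
in electrostatics (see p. 171). There, because of $\nabla\times\mathbf{E} = 0$, we could also write
$(\mathbf{p}\cdot\nabla)\mathbf{E}$ instead of $\nabla(\mathbf{p}\cdot\mathbf{E})$. In contrast, we have $\nabla\cdot\mathbf{B} = 0$ here, and therefore
$\nabla(\mathbf{m}\cdot\mathbf{B})$ is also equal to $(\mathbf{m}\times\nabla)\times\mathbf{B}$. Furthermore, $\mathbf{p}$ and $\mathbf{E}$ are polar vectors,
while $\mathbf{m}$ and $\mathbf{B}$ are axial.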




Fig. 3.12 Tensor force of a moment (↑) at the position ○ on a moment at the position ●. Equal
moments (↑), opposite moments (↓), in between the perpendicular moment (→)


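Hence the potential energy of two dipole moments $\mathbf{m}$ and $\mathbf{m}'$ at positions $\mathbf{r}\neq\mathbf{r}'$
is obtained as

$$E_{\rm pot} = -\frac{\mu_0}{4\pi}\,\mathbf{m}\cdot\nabla\;\mathbf{m}'\cdot\nabla\,\frac{1}{|\mathbf{r}-\mathbf{r}'|} = \frac{\mu_0}{4\pi}\,\frac{\mathbf{m}\cdot\mathbf{m}' - 3\,\mathbf{m}\cdot\mathbf{e}\;\mathbf{m}'\cdot\mathbf{e}}{|\mathbf{r}-\mathbf{r}'|^{3}}.$$

Here $\mathbf{e} = (\mathbf{r}-\mathbf{r}')/|\mathbf{r}-\mathbf{r}'|$. (For the last equation, compare p. 172.) With $\mathbf{r}\neq\mathbf{r}'$, this
yields

$$\mathbf{F} = -\nabla E_{\rm pot} = \frac{\mu_0}{4\pi}\,\mathbf{m}\cdot\nabla\;\mathbf{m}'\cdot\nabla\;\nabla\frac{1}{|\mathbf{r}-\mathbf{r}'|} = \frac{3\mu_0}{4\pi}\,\frac{\mathbf{m}\cdot\mathbf{e}\;\mathbf{m}' + \mathbf{m}'\cdot\mathbf{e}\;\mathbf{m} + (\mathbf{m}\cdot\mathbf{m}' - 5\,\mathbf{m}\cdot\mathbf{e}\;\mathbf{m}'\cdot\mathbf{e})\,\mathbf{e}}{|\mathbf{r}-\mathbf{r}'|^{4}}$$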


for the force acting on $\mathbf{m}$. This force depends upon the directions of the three vectors
$\mathbf{m}$, $\mathbf{m}'$, and $\mathbf{e}$, and does not always lie in the direction of $\pm\mathbf{e}$: it is not a central, but
a tensor force (see Fig. 3.12).
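
As a plausibility check, the closed-form force can be compared with a numerical gradient of the dipole-dipole energy. This is a hedged sketch only: the function names and the step size `h` are assumptions made for illustration, and $\mu_0 = 4\pi\times10^{-7}$ N/A² is the value quoted later in this section.

```python
import math

MU0 = 4e-7 * math.pi  # magnetic field constant in N/A^2


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def separation(r, rp):
    """Return the distance |r - r'| and the unit vector e = (r - r')/|r - r'|."""
    d = [x - y for x, y in zip(r, rp)]
    dist = math.sqrt(dot(d, d))
    return dist, [x / dist for x in d]


def dipole_energy(m, mp, r, rp):
    """E_pot = mu0/(4 pi) * (m.m' - 3 m.e m'.e) / |r - r'|^3."""
    dist, e = separation(r, rp)
    return MU0 / (4 * math.pi) * (dot(m, mp) - 3 * dot(m, e) * dot(mp, e)) / dist**3


def dipole_force(m, mp, r, rp):
    """Closed-form force on the moment m at r due to the moment m' at r'."""
    dist, e = separation(r, rp)
    me, mpe = dot(m, e), dot(mp, e)
    pref = 3 * MU0 / (4 * math.pi) / dist**4
    return [pref * (me * mp[i] + mpe * m[i] + (dot(m, mp) - 5 * me * mpe) * e[i])
            for i in range(3)]


def numerical_force(m, mp, r, rp, h=1e-6):
    """F = -grad_r E_pot evaluated by central differences."""
    force = []
    for i in range(3):
        r_plus = list(r)
        r_plus[i] += h
        r_minus = list(r)
        r_minus[i] -= h
        diff = dipole_energy(m, mp, r_plus, rp) - dipole_energy(m, mp, r_minus, rp)
        force.append(-diff / (2 * h))
    return force
```

For two equal moments aligned along $\mathbf{e}$ (head to tail), the bracket reduces to $-2m^2\,\mathbf{e}$, so the force points towards the other moment, i.e., the moments attract.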

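We generalize the expression for $E_{\rm pot}$ to an extended magnetization:

$$E_{\rm pot} = -\int dV\,\mathbf{M}\cdot\mathbf{B}.$$

The integrand can be rewritten: $\mathbf{M}\cdot(\nabla\times\mathbf{A}) = \nabla\cdot(\mathbf{A}\times\mathbf{M}) + \mathbf{A}\cdot(\nabla\times\mathbf{M})$. With
Gauss's theorem (and no magnetization on the surface of $V$), and because $\nabla\times\mathbf{M} = \mathbf{j}$, it follows that

$$E_{\rm pot} = -\int dV\,\mathbf{j}(\mathbf{r})\cdot\mathbf{A}(\mathbf{r}).$$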


(In Sect. 2.3.4, and in particular p. 98, we mentioned that the generalized potential
energy $-q\,\mathbf{v}\cdot\mathbf{A}$ belongs to the velocity-dependent Lorentz force acting on point
charges. This is in accord with $E_{\rm pot} = -\int dV\,\mathbf{j}\cdot\mathbf{A}$.) Even though the vector potential
can be re-gauged, the difference $\int dV\,\mathbf{j}\cdot\nabla\Psi = \int dV\,\{\nabla\cdot(\Psi\,\mathbf{j}) - \Psi\,\nabla\cdot\mathbf{j}\}$ does not
contribute in the case of stationary currents because of Gauss's theorem (in finite
current loops).

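For the interaction energy of two conductors, we thus have

$$E_{\rm pot} = -\frac{\mu_0}{4\pi}\iint dV\,dV'\,\frac{\mathbf{j}(\mathbf{r})\cdot\mathbf{j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}.$$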


In order to derive the associated force, we have to consider the position dependence
of this potential energy. The two current loops change only their relative positions,
but neither their current densities nor their form. The potential energy originates from
the fact that two current loops are brought together from a very great distance and
that forces then appear. We should therefore introduce the average separation $\mathbf{R}$ of
the two conductors and consider the double integral

$$\iint d\mathbf{r}''\cdot d\mathbf{r}'\;|\mathbf{R}+\mathbf{r}''-\mathbf{r}'|^{-1}.$$

The force between the two current loops then follows from $\mathbf{F} = -\nabla_{R}\,E_{\rm pot}$ as
(Ampère's force law)

$$\mathbf{F} = \mu_0\iint dV\,dV'\;\mathbf{j}(\mathbf{r})\cdot\mathbf{j}(\mathbf{r}')\;\nabla\frac{1}{4\pi\,|\mathbf{r}-\mathbf{r}'|}.$$


According to this, parallel wires attract each other if electric currents flow in the same
direction, and repel each other for currents that flow in opposite directions. In other
words, currents of like sign attract each other, while like charges repel each other, because
Coulomb's law contains $-c_0^{2}\,q\,q'$ instead of $\iint I\,I'\,d\mathbf{r}\cdot d\mathbf{r}'$, something we shall be
concerned with in the next section. Here $\mathbf{F}$ is the total force which the conductor
with primed quantities exerts on the other (unprimed) one. Since $\nabla' G(\mathbf{r}-\mathbf{r}') = -\nabla G(\mathbf{r}-\mathbf{r}')$, it follows that $\mathbf{F}' = -\mathbf{F}$, as is required also by Newton's third law.
(Current-carrying conductors do not exert a force on themselves. In this case primed
and unprimed quantities must be interchangeable.) The factor $\mu_0/4\pi$ is connected
with the chosen concept of current strength:


If two parallel (straight) conductors of negligible cross-section a distance 1 m apart in vacuum
each carry a current of 1 A, then they exert a force of $2\times10^{-7}$ N per meter length on each
other.


The double integral of $d\mathbf{r}\cdot d\mathbf{r}' = dz\,dz'$ is important. We may restrict ourselves to a
conductor element $dz$ around $z = 0$. If the two conductors are separated by a distance
$R$, then since $|\nabla|\mathbf{r}-\mathbf{r}'|^{-1}|$ corresponds to $|\partial(R^2+z'^2)^{-1/2}/\partial R| = R\,(R^2+z'^2)^{-3/2}$, the integral
$\int dz'\,R\,(R^2+z'^2)^{-3/2} = z'\,R^{-1}\,(R^2+z'^2)^{-1/2}$ is to be taken from $-\infty$ to $+\infty$, which gives $2/R$. We
thus deduce the force per unit length to be

$$\frac{F}{l} = \frac{\mu_0\,I\,I'}{2\pi R}.$$

Given the magnetic field constant $\mu_0 = 4\pi\times10^{-7}$ N/A², we do indeed find the
above-mentioned force from Ampère's law.
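
The arithmetic behind the quoted force can be reproduced in a few lines. This is a sketch under stated assumptions: the function names, the cutoff of the $z'$ integration, and the number of midpoint steps are illustrative choices, not taken from the text.

```python
import math

MU0 = 4e-7 * math.pi  # magnetic field constant in N/A^2, as quoted in the text


def force_per_length(current_1, current_2, distance):
    """Closed form F/l = mu0 I I' / (2 pi R) for two parallel straight wires."""
    return MU0 * current_1 * current_2 / (2 * math.pi * distance)


def force_per_length_numeric(current_1, current_2, distance,
                             cutoff=1000.0, steps=200_000):
    """Midpoint-rule estimate of (mu0/4pi) I I' * integral of R (R^2 + z'^2)^(-3/2) dz'.

    The infinite range is truncated at |z'| = cutoff * R (an assumed cutoff).
    """
    z_max = cutoff * distance
    dz = 2 * z_max / steps
    total = 0.0
    for n in range(steps):
        z = -z_max + (n + 0.5) * dz
        total += distance * (distance**2 + z**2) ** -1.5 * dz
    return MU0 / (4 * math.pi) * current_1 * current_2 * total
```

Calling `force_per_length(1.0, 1.0, 1.0)` evaluates $\mu_0/2\pi$, i.e., the $2\times10^{-7}$ N per meter of the definition above; the numerical integral agrees up to the small truncation error of the cutoff.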


3.2.10 Inductance 


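For the interaction energy of two thin conductor loops, we find

$$E_{\rm pot} = -I\oint d\mathbf{r}\cdot\mathbf{A} \equiv -L\,I\,I',$$

with the mutual inductance

$$L = \frac{\mu_0}{4\pi}\oint\oint\frac{d\mathbf{r}\cdot d\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|}.$$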

According to this, known as the Neumann formula, L is positive for currents in the 
same direction in coaxial loops. Figure 3.13 shows an example whose inductance we 
shall now calculate for radii R and R’, and distance a. 

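Because $|\mathbf{r}-\mathbf{r}'|^2 = a^2 + (\mathbf{R}-\mathbf{R}')\cdot(\mathbf{R}-\mathbf{R}')$, we can find $L$ from

$$L = \frac{\mu_0}{4\pi}\,R\,R'\int_0^{2\pi} d\varphi\int_0^{2\pi} d\varphi'\;\frac{\cos(\varphi-\varphi')}{\sqrt{a^2+R^2+R'^2-2RR'\cos(\varphi-\varphi')}}.$$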


Fig. 3.13 Top and side view of two coaxial current loops (continuous lines) at the distance a. The 
line |r — r'| connecting two points is shown by a dashed line. It can be calculated with the help of 
the dotted lines (radii R and R’) 


Fig. 3.14 Mutual inductance $L(k^2)$ of two coaxial current loops (radii $R$ and $R'$ at
distance $a$) with $k^2 = 4RR'/\{a^2+(R+R')^2\}$. The plot shows $L/(\mu_0\sqrt{RR'})$ against $k^2$ from 0 to 1


The double integral is equal to $2\pi\int_0^{2\pi}\{a^2+R^2+R'^2-2RR'\cos\psi\}^{-1/2}\cos\psi\,d\psi$.
If we integrate only from 0 to $\pi$, we obtain half the value. With $z = \tfrac12(\pi-\psi)$, it
follows that $\cos\psi = -\cos(2z) = 2\sin^2 z - 1$ and $d\psi = -2\,dz$:

$$L = \mu_0\sqrt{RR'}\;k\int_0^{\pi/2}\frac{2\sin^2 z-1}{\sqrt{1-k^2\sin^2 z}}\,dz,\quad\text{with}\quad k^2 = \frac{4RR'}{a^2+(R+R')^2}.$$


Note that here $k^2 < 1$, because we consider only separate conductor loops and we
have $4RR' = (R+R')^2 - (R-R')^2$. We thus encounter the complete elliptic integrals
of first and second kind (see p. 104 and Fig. 2.33):

$$K(k^2) = \int_0^{\pi/2}\frac{dz}{\sqrt{1-k^2\sin^2 z}}$$

and

$$E(k^2) = \int_0^{\pi/2}\sqrt{1-k^2\sin^2 z}\;dz.$$


Since $\sin^2 z = \{1-(1-k^2\sin^2 z)\}/k^2$, this implies that

$$\int_0^{\pi/2}\frac{2\sin^2 z-1}{\sqrt{1-k^2\sin^2 z}}\,dz = 2\,\frac{K(k^2)-E(k^2)}{k^2} - K(k^2).$$


Finally,

$$L = \mu_0\sqrt{RR'}\;\frac{2\,(K-E) - k^2K}{k}.$$


The mutual inductance of two coaxial circles can thus be reduced to elliptic integrals 
(see Fig. 3.14). 

Particularly important is the special case $R\approx R'\gg a$, i.e., $k\approx 1$, of two close
current loops. Then the integrand of $E$ is approximately equal to $\cos z$, so $E\approx 1$
and $L\approx\mu_0\sqrt{RR'}\,(K-2)$. To calculate $K$ for $k\approx 1$, a series expansion cannot be
employed, since the indefinite integral for $k = 1$ diverges as $\ln\cot(\tfrac14\pi-\tfrac12 z)$ at the
upper boundary. But for the incomplete elliptic integral of the first kind (see p. 103)

$$F(\varphi\,|\,k^2) = \int_0^{\varphi}\frac{dz}{\sqrt{1-k^2\sin^2 z}},\quad\Longrightarrow\quad F(\tfrac12\pi\,|\,k^2) = K(k^2),$$


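there exists the ascending Landen transformation (in $k^2$) $2z_1 = z + \arcsin(k\sin z)$
(see Problem 3.29), viz.,

$$F(\varphi\,|\,k^2) = \frac{2}{1+k}\,F(\varphi_1\,|\,k_1^{2}),$$

with

$$k_1^{2} = \frac{4k}{(1+k)^2}\quad\text{and}\quad\varphi_1 = \frac{\varphi+\arcsin(k\sin\varphi)}{2}.$$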


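For $k^2 = 1-\varepsilon$ and $\varphi = \tfrac12\pi$, we have $k_1^{2}\approx 1-\tfrac{1}{16}\varepsilon^2$ and $\varphi_1 = \tfrac12\pi-\delta\varphi$ with $\delta\varphi = \tfrac12\arccos\sqrt{1-\varepsilon}\approx\tfrac12\sqrt{\varepsilon}$. Consequently, for the ascending transformation (in $k^2$), the
upper boundary $\varphi$ decreases, and now we may set

$$k_1\approx 1:\quad F(\tfrac12\pi-\delta\varphi\,|\,1) = \ln\cot(\tfrac12\,\delta\varphi)\approx\ln(4/\sqrt{\varepsilon}).$$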


Hence for $k\approx 1$, we arrive at $K\approx\ln(4/\sqrt{1-k^2})$ and obtain

$$L\approx\mu_0\sqrt{RR'}\,\Big(\ln\frac{4\,(R+R')}{\sqrt{a^2+(R-R')^2}} - 2\Big),\quad\text{for}\quad R\approx R'\gg a,$$

i.e., for two nearby loops with like axis.
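
The exact expression and the close-loop approximation can be compared numerically. The sketch below is an illustration under assumptions: it evaluates $K$ and $E$ with the arithmetic-geometric mean, a standard technique swapped in here that the text itself does not use, and all function names are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # magnetic field constant in N/A^2


def ellip_KE(k2):
    """Complete elliptic integrals K(k^2) and E(k^2) via the arithmetic-geometric mean."""
    if not 0.0 <= k2 < 1.0:
        raise ValueError("k^2 must lie in [0, 1)")
    a, b = 1.0, math.sqrt(1.0 - k2)
    weighted_sum = 0.5 * k2   # term 2^(n-1) c_n^2 for n = 0, with c_0 = k
    weight = 1.0              # 2^(n-1) for n = 1
    for _ in range(60):       # AGM converges quadratically; 60 steps is ample
        if abs(a - b) <= 1e-15 * a:
            break
        c = 0.5 * (a - b)
        a, b = 0.5 * (a + b), math.sqrt(a * b)
        weighted_sum += weight * c * c
        weight *= 2.0
    K = math.pi / (2.0 * a)
    E = K * (1.0 - weighted_sum)
    return K, E


def mutual_inductance(R, Rp, a):
    """L = mu0 sqrt(R R') [2 (K - E) - k^2 K] / k for two coaxial loops."""
    k2 = 4.0 * R * Rp / (a * a + (R + Rp) ** 2)
    K, E = ellip_KE(k2)
    return MU0 * math.sqrt(R * Rp) * (2.0 * (K - E) - k2 * K) / math.sqrt(k2)


def mutual_inductance_close(R, Rp, a):
    """Approximation for R ~ R' >> a: L ~ mu0 sqrt(R R') (ln(4(R+R')/sqrt(a^2+(R-R')^2)) - 2)."""
    log_term = math.log(4.0 * (R + Rp) / math.sqrt(a * a + (R - Rp) ** 2))
    return MU0 * math.sqrt(R * Rp) * (log_term - 2.0)
```

For example, with $R = R' = 1$ m and $a = 1$ cm the two functions should agree closely, whereas for well-separated loops only `mutual_inductance` is meaningful.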


3.2.11 Summary: Stationary Currents and Magnetostatics 


For electric currents, we use the current density $\mathbf{j} = \rho\,\mathbf{v}$ and the current strength $I = \int d\mathbf{f}\cdot\mathbf{j}$. Stationary currents are solenoidal. In the following, according to common
practice, we write the averaged current density without the bar, since we would like
to use only macroscopically measurable quantities anyway.

In many cases, we have Ohm's law in differential form

$$\mathbf{j} = \sigma\,\mathbf{E} + \mathbf{j}'.$$

Here $\sigma$ is the conductivity and $\mathbf{j}'$ the current density at the current sources.


All electric currents are accompanied by a magnetic field. Hence we can also
identify currents in atoms, which do not contribute to macroscopic electric currents.
They can be understood via the magnetization $\mathbf{M}$ or via the magnetic moment $\mathbf{m} = \tfrac12\int dV\,\mathbf{r}\times\mathbf{j}$. Hence, macroscopically,

$$\nabla\cdot\mathbf{B} = 0\quad\text{and}\quad\nabla\times\mathbf{H} = \mathbf{j},\quad\text{with}\quad\mathbf{B} = \mu_0\,(\mathbf{H}+\mathbf{M}) = \mu\,\mathbf{H}.$$


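Since the induction field is solenoidal, it derives from a vector potential $\mathbf{A}$ with
the property $\mathbf{B} = \nabla\times\mathbf{A}$. For the Coulomb gauge ($\nabla\cdot\mathbf{A} = 0$) and using $\nabla\times\mathbf{B} = \mu_0\,(\mathbf{j}+\nabla\times\mathbf{M})$, we have

$$\mathbf{A}(\mathbf{r}) = \mu_0\int dV'\,\frac{\mathbf{j}(\mathbf{r}')+\nabla'\times\mathbf{M}(\mathbf{r}')}{4\pi\,|\mathbf{r}-\mathbf{r}'|}.$$

The magnetic field acts on a moving charge via the Lorentz force $\mathbf{F} = \int dV\,\mathbf{j}\times\mathbf{B}$.
The force between two conductors with stationary currents is then given by the
Ampère law

$$\mathbf{F} = \mu_0\iint dV\,dV'\;\mathbf{j}(\mathbf{r})\cdot\mathbf{j}(\mathbf{r}')\;\nabla\frac{1}{4\pi\,|\mathbf{r}-\mathbf{r}'|}.$$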


Currents with like orientation in parallel conductors attract each other. 


3.3 Electromagnetic Field 


3.3.1 Charge Conservation and Maxwell’s Displacement Current


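The charge conservation law was expressed on p. 187 in the form of a continuity
equation, viz.,

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0.$$

Conversely, the continuity equation ensures charge conservation. Since $\rho = \nabla\cdot\mathbf{D}$,
we thus also have

$$\nabla\cdot\Big(\mathbf{j}+\frac{\partial\mathbf{D}}{\partial t}\Big) = 0,$$

or according to Gauss's theorem,

$$0 = \oint_{(V)} d\mathbf{f}\cdot\Big(\mathbf{j}+\frac{\partial\mathbf{D}}{\partial t}\Big) = I + \frac{dQ}{dt}.$$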




As long as, e.g., the charge on the anode of a capacitor increases, a current will also
flow, with a sink for the current density $\mathbf{j}$. If we connect the current loop with the
capacitor in a Gedanken experiment, an electric current will flow in the conductor,
while Maxwell's displacement current will flow in a non-conductor, with current
density $\partial\mathbf{D}/\partial t$. If an electric field changes with time, then this is the corresponding
current.

The sum of the conduction and displacement current densities is solenoidal, and
hence is a rotational field. For stationary currents, it is the curl of the magnetic field
$\mathbf{H}$, but this is in fact generally true:

$$\nabla\times\mathbf{H} = \mathbf{j}+\frac{\partial\mathbf{D}}{\partial t}\quad\Longrightarrow\quad\oint_{(A)} d\mathbf{r}\cdot\mathbf{H} = I + \frac{d}{dt}\int_A d\mathbf{f}\cdot\mathbf{D}.$$


While a capacitor is being charged, there is thus a magnetic field around it, not only
around the connecting wires. For the path integral $\oint d\mathbf{r}\cdot\mathbf{H}$, only the boundary of the
area $A$ is of interest. If we choose two different sheets with the same boundary $(A)$
for $\int d\mathbf{f}\cdot\mathbf{D}$, then the values of the surface integrals differ by the charge $Q$ enclosed
by these two sheets. In fact, $I+\dot{Q}$ then no longer depends on the chosen area.

In insulators there is no conduction current, but at most a displacement current,
while in conductors the displacement current is in most cases negligible compared
to the electric current. If we have a periodic process with angular frequency $\omega$,
then for $j/\dot{D}$ this clearly depends on the ratio $\sigma/\varepsilon\omega$. Here most conductors have
$\sigma/\varepsilon > 100$ THz. Therefore, the order of magnitude of the ratio $\sigma/\varepsilon\omega$ is only unity
for frequencies common in optics.

As long as the displacement current is negligible compared to the electric current, the
currents are said to be quasi-static; for stationary currents all derivatives with respect
to time vanish.


3.3.2 Faraday Induction Law and Lenz’s Rule 


As was just shown, the two equations $\nabla\cdot\mathbf{D} = \rho$ and $\nabla\times\mathbf{H} = \mathbf{j}+\partial\mathbf{D}/\partial t$ ensure
charge conservation. If there were no free charges but only electric dipoles, we would
have instead $\nabla\cdot\mathbf{D} = 0$ and $\partial\mathbf{D}/\partial t = \nabla\times\mathbf{H}$. This is noteworthy insofar as we would
not find magnetic charges, but only magnetic dipoles, whence we already set up the
equation $\nabla\cdot\mathbf{B} = 0$ in magnetostatics. Hence we can ask the question whether $\partial\mathbf{B}/\partial t$
is equal to the circulation density of a (polar) vector field, in particular, a vector field
which is irrotational for time-independent phenomena.
In fact, we have the Faraday induction law,

$$\frac{\partial\mathbf{B}}{\partial t} = -\nabla\times\mathbf{E}.$$

Fig. 3.15 Lenz's rule. The time-dependence of the magnetic field $\partial\mathbf{B}/\partial t = -\nabla\times\mathbf{E}$ induces a
current density $\mathbf{j} = \sigma\,\mathbf{E}$ in the conductor loop. This current is accompanied by a magnetic field
curl density $\nabla\times\mathbf{H} = \mathbf{j}$ such that, on the plane of the loop, $\mathbf{H}$ and $\partial\mathbf{B}/\partial t$ are oriented in opposite
directions


A time-dependent magnetic field and the curl of the electric field are related: the 
magnetic field induces an electric current in a conductor loop. Every dynamo makes 
use of this. The sign in the induction law supplies the important Lenz rule (see 
Fig. 3.15): the induced current works against its cause. 

In integral form, the induction law reads

$$\oint_{(A)} d\mathbf{r}\cdot\mathbf{E} = -\frac{d}{dt}\int_A d\mathbf{f}\cdot\mathbf{B}.$$

Since V - B = 0, the last expression depends only on the boundary of the area A. 
The left contour integral is called the circulation voltage or induction voltage. We 
note that the concept of electric voltage between two points introduced previously 
(p. 169) can now yield different values depending on the path in-between. 


3.3.3 Maxwell’s Equations 


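Now we have prepared sufficiently for the famous Maxwell equations, with which
we can describe many phenomena of electricity and optics, including also $\mathbf{D} = \varepsilon\,\mathbf{E}$
and $\mathbf{B} = \mu\,\mathbf{H}$:

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t},\qquad\nabla\cdot\mathbf{B} = 0,$$

$$\nabla\cdot\mathbf{D} = \rho,\qquad\nabla\times\mathbf{H} = \mathbf{j}+\frac{\partial\mathbf{D}}{\partial t}.$$

These couple the electric and magnetic fields. It is thus better to speak of the total
electromagnetic field. As integral equations, they read

$$\oint_{(A)} d\mathbf{r}\cdot\mathbf{E} = -\frac{d}{dt}\int_A d\mathbf{f}\cdot\mathbf{B},\qquad\oint_{(V)} d\mathbf{f}\cdot\mathbf{B} = 0,$$

$$\oint_{(V)} d\mathbf{f}\cdot\mathbf{D} = Q,\qquad\oint_{(A)} d\mathbf{r}\cdot\mathbf{H} = I+\frac{d}{dt}\int_A d\mathbf{f}\cdot\mathbf{D},$$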


because $\int_V dV\,\rho = Q$ and $\int_A d\mathbf{f}\cdot\mathbf{j} = I$. The boundary conditions for the transition
at an interface are similar to those in the static case:

$$\mathbf{n}\times(\mathbf{E}_+-\mathbf{E}_-) = 0,\qquad\mathbf{n}\cdot(\mathbf{B}_+-\mathbf{B}_-) = 0,$$

$$\mathbf{n}\cdot(\mathbf{D}_+-\mathbf{D}_-) = \rho_A,\qquad\mathbf{n}\times(\mathbf{H}_+-\mathbf{H}_-) = \mathbf{j}_A.$$

In particular, there is no field $\mathbf{B}$ or $\mathbf{D}$ whose derivative with respect to time on the
interface is singular like a delta function. There is at most a discontinuity like a
step function. Its source density or circulation density may be singular like a delta
function, but because $\delta(x) = \varepsilon'(x)$, there is only a finite discontinuity in the field,
not an infinite one as for the delta function. Therefore, the derivatives of $\mathbf{B}$ and $\mathbf{D}$
with respect to time do not contribute to the surface curl density.

Clearly, the curl of the electric and the magnetic field are connected with time- 
dependent changes, while their sources are already known from statics. Therefore, 
in statics E and H, or D and B, are similar. But for time-dependent phenomena on 
the one hand E and B are connected, and on the other hand D and H are connected. 

All Maxwell’s equations were already known prior to Maxwell, except for the one 
involving the displacement current, but it is only by virtue of the latter that certain 
key phenomena such as charge conservation and electromagnetic waves can exist. 

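According to the Fourier transform $\mathbf{r}\to\mathbf{k}$ (see p. 22),

$$\mathbf{E}(t,\mathbf{r}) = \frac{1}{\sqrt{2\pi}^{\,3}}\int d^3k\,\exp(+i\mathbf{k}\cdot\mathbf{r})\,\mathbf{E}(t,\mathbf{k}),$$

$$\mathbf{E}(t,\mathbf{k}) = \frac{1}{\sqrt{2\pi}^{\,3}}\int d^3r\,\exp(-i\mathbf{k}\cdot\mathbf{r})\,\mathbf{E}(t,\mathbf{r}),$$

and correspondingly for $\mathbf{D}$, $\mathbf{B}$, $\mathbf{H}$, $\mathbf{j}$, and $\rho$, Maxwell's equations read

$$i\mathbf{k}\times\mathbf{E}(t,\mathbf{k}) = -\frac{\partial\mathbf{B}(t,\mathbf{k})}{\partial t},\qquad i\mathbf{k}\cdot\mathbf{B}(t,\mathbf{k}) = 0,$$

$$i\mathbf{k}\cdot\mathbf{D}(t,\mathbf{k}) = \rho(t,\mathbf{k}),\qquad i\mathbf{k}\times\mathbf{H}(t,\mathbf{k}) = \mathbf{j}(t,\mathbf{k})+\frac{\partial\mathbf{D}(t,\mathbf{k})}{\partial t},$$

and the continuity equation

$$\frac{\partial\rho(t,\mathbf{k})}{\partial t}+i\mathbf{k}\cdot\mathbf{j}(t,\mathbf{k}) = 0.$$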


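The real differential expressions in real space thus become complex in k-space, but
local expressions for the transverse and longitudinal parts of the field. In particular,
the induction field is purely transverse:

$$\nabla\times\mathbf{E}_{\rm trans} = -\frac{\partial\mathbf{B}_{\rm trans}}{\partial t},\qquad\mathbf{B}_{\rm long} = 0.$$

In addition, $\nabla\cdot\mathbf{D}_{\rm long} = \rho$ holds, and we can split up the fourth of Maxwell's equations:

$$\nabla\times\mathbf{H}_{\rm trans} = \mathbf{j}_{\rm trans}+\frac{\partial\mathbf{D}_{\rm trans}}{\partial t}\quad\text{and}\quad 0 = \mathbf{j}_{\rm long}+\frac{\partial\mathbf{D}_{\rm long}}{\partial t}.$$

With $\nabla\cdot\mathbf{D}_{\rm long} = \rho$, the last equation leads to the continuity equation.

The fields are real in real space and, according to p. 22, have the symmetry
$\mathbf{E}(t,\mathbf{k}) = \mathbf{E}^*(t,-\mathbf{k})$, and likewise for $\mathbf{D}$, $\mathbf{B}$, $\mathbf{H}$, $\mathbf{j}$, and $\rho$. In particular,
a point charge $\rho(t,\mathbf{r}) = q\,\delta(\mathbf{r}-\mathbf{r}')$ has the (complex) Fourier transform $\rho(t,\mathbf{k}) = (2\pi)^{-3/2}\,q\,\exp(-i\mathbf{k}\cdot\mathbf{r}')$.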

We derived the microscopic Maxwell equations from the “facts of observation”. 
There are electric, but no magnetic charges; charges remain conserved; we find the 
force law due to Coulomb, the one due to Ampère (Lorentz), and also Faraday’s 
induction law. The “macroscopic” Maxwell equations start from 


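$$\mathbf{D} = \varepsilon_0\,\mathbf{E}+\mathbf{P} = \varepsilon\,\mathbf{E}\quad\text{and}\quad\mathbf{B} = \mu_0\,(\mathbf{H}+\mathbf{M}) = \mu\,\mathbf{H},$$

with averaged charge and current densities, the polarization $\mathbf{P}$, and the magnetization
$\mathbf{M}$. Actually, we should have written $\mathbf{H} = \mathbf{B}/\mu_0-\mathbf{M} = \mathbf{B}/\mu$ for the magnetic
excitation, since $\mathbf{E}$ and $\mathbf{B}$ are related, and likewise $\mathbf{D}$ and $\mathbf{H}$.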

In the following we shall always assume linear relations between $\mathbf{D}$ and $\mathbf{E}$ and/or
$\mathbf{H}$ and $\mathbf{B}$, even though there are also “nonlinear effects”, e.g., for hysteresis and for
strong fields of the kind occurring in laser light. In addition, we calculate only with
scalar relations. This is generally not allowed in crystal physics, where $\varepsilon$ and $\mu$ are
tensors. But even there, many phenomena can already be treated, and the calculations
are then simple.

In addition, we have to observe Ohm's law:

$$\mathbf{j} = \sigma\,\mathbf{E}\quad\Longleftrightarrow\quad U = R\,I.$$

To a first approximation, the conductivity $\sigma$ and the resistance $R$ do not depend on
the applied field. (Here $\sigma$ is actually a tensor.)


3.3.4 Time-Dependent Potentials 


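As long as the fields do not depend on time, they can be derived from the scalar
potential $\Phi$ and the vector potential $\mathbf{A}$, as was shown in Sects. 3.1.3 and 3.2.8.
This works even for time-dependent fields. The induction field in particular remains
solenoidal, and therefore can still be derived from the curl of a vector potential:

$$\nabla\cdot\mathbf{B} = 0\quad\Longleftrightarrow\quad\mathbf{B} = \nabla\times\mathbf{A}.$$

However, for time-dependent magnetic fields the electric field $\mathbf{E}$ has curls, and a
gradient field ($-\nabla\Phi$) is no longer sufficient, but since according to the last equation
we have $\partial\mathbf{B}/\partial t = \nabla\times\partial\mathbf{A}/\partial t$, the induction law $\nabla\times\mathbf{E} = -\partial\mathbf{B}/\partial t$ now implies

$$\mathbf{E} = -\nabla\Phi-\frac{\partial\mathbf{A}}{\partial t}.$$

With the two quantities $\Phi$ and $\mathbf{A}$ (which have four components in total), we can thus
determine the two vector fields $\mathbf{E}$ and $\mathbf{B}$ (with six components in total). It remains
only to comply with the two remaining Maxwell equations (where we assume
$\mathbf{D} = \varepsilon\,\mathbf{E}$ and $\mathbf{B} = \mu\,\mathbf{H}$ with constant factors $\varepsilon$ and $\mu$, i.e., homogeneous matter). Since
$\Delta\Phi = \nabla\cdot\nabla\Phi$, it follows from $\nabla\cdot\mathbf{D} = \rho$ that

$$\Big(\Delta-\varepsilon\mu\,\frac{\partial^2}{\partial t^2}\Big)\Phi = -\frac{\rho}{\varepsilon}-\frac{\partial}{\partial t}\Big(\nabla\cdot\mathbf{A}+\varepsilon\mu\,\frac{\partial\Phi}{\partial t}\Big),$$

and since $\Delta\mathbf{A} = \nabla(\nabla\cdot\mathbf{A})-\nabla\times(\nabla\times\mathbf{A})$,

$$\Big(\Delta-\varepsilon\mu\,\frac{\partial^2}{\partial t^2}\Big)\mathbf{A} = -\mu\,\mathbf{j}+\nabla\Big(\nabla\cdot\mathbf{A}+\varepsilon\mu\,\frac{\partial\Phi}{\partial t}\Big).$$
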
We do not use $\mathbf{j} = \sigma\,\mathbf{E}$, since here $\rho$ and $\mathbf{j}$ are viewed as given. The potentials are
not unique though, since the source of the vector potential has not yet been given.
The magnetic field does not depend on it, and its influence on the electric field can
be counteracted by a change in the scalar potential. Therefore, despite the gauge
transformation

$$\Phi' = \Phi+\frac{\partial\Psi}{\partial t}\quad\text{and}\quad\mathbf{A}' = \mathbf{A}-\nabla\Psi,$$

with continuously differentiable $\Psi$, the same fields $\mathbf{E}$ and $\mathbf{B}$ result. Physical quantities
do not depend on the gauge. The curl of the vector potential determines the magnetic
field, and the sources determine $\Delta\Psi$. In the static case we were allowed to choose
these sources arbitrarily, but now their time dependence shows up for the scalar
potential. Every gauge transformation changes the longitudinal component of the
vector potential and the scalar potential. Then it is clear that

$$\mathbf{E}_{\rm long} = -\nabla\Phi-\frac{\partial\mathbf{A}_{\rm long}}{\partial t},\qquad\mathbf{B}_{\rm long} = 0,$$

$$\mathbf{E}_{\rm trans} = -\frac{\partial\mathbf{A}_{\rm trans}}{\partial t},\qquad\mathbf{B}_{\rm trans} = \nabla\times\mathbf{A}_{\rm trans}.$$

Longitudinal fields are irrotational, transverse ones solenoidal.


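There are two possibilities for the gauge such that the equations for the scalar and
the vector potential decouple. This can be seen immediately for the Lorentz gauge

$$\nabla\cdot\mathbf{A}+\varepsilon\mu\,\frac{\partial\Phi}{\partial t} = 0,$$

and in particular,

$$\Big(\Delta-\varepsilon\mu\,\frac{\partial^2}{\partial t^2}\Big)\Phi = -\frac{\rho}{\varepsilon},\qquad\Big(\Delta-\varepsilon\mu\,\frac{\partial^2}{\partial t^2}\Big)\mathbf{A} = -\mu\,\mathbf{j}.$$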


These formally similar equations will be preferred in the next section on Lorentz 
invariance. There is a retardation effect here: p and j are important at time t’ = t — 
|r — r’|/c, showing that actions propagate with finite velocity. This will be explained 
in more detail in Sect. 3.5.1. 

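But for the moment we prefer to take the Coulomb gauge (radiation gauge, transverse gauge)

$$\nabla\cdot\mathbf{A} = 0.$$

Even though initially this yields

$$\Delta\Phi = -\frac{\rho}{\varepsilon},\qquad\Big(\Delta-\varepsilon\mu\,\frac{\partial^2}{\partial t^2}\Big)\mathbf{A} = -\mu\,\Big(\mathbf{j}-\varepsilon\,\nabla\frac{\partial\Phi}{\partial t}\Big),$$

according to p. 27, the Poisson equation $\Delta\Phi = -\rho/\varepsilon$ is solved by

$$\Phi(t,\mathbf{r}) = \frac{1}{4\pi\varepsilon}\int dV'\,\frac{\rho(t,\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|},$$

and with the continuity equation $\partial\rho/\partial t = -\nabla\cdot\mathbf{j}$ therefore leads to

$$\frac{\partial\Phi}{\partial t} = -\frac{1}{4\pi\varepsilon}\int dV'\,\frac{\nabla'\cdot\mathbf{j}(t,\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}.$$

Thus according to p. 25, $\varepsilon\,\nabla\partial\Phi/\partial t$ comprises the part of the current density that
originates in the sources, and therefore $\mathbf{j}-\varepsilon\,\nabla\partial\Phi/\partial t$ is the solenoidal (transverse)
current density

$$\mathbf{j}_{\rm trans}(t,\mathbf{r}) = \nabla\times\int dV'\,\frac{\nabla'\times\mathbf{j}(t,\mathbf{r}')}{4\pi\,|\mathbf{r}-\mathbf{r}'|}.$$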


Consequently, the system of equations is also decoupled in the Coulomb gauge:

$$\Delta\Phi = -\frac{\rho}{\varepsilon},\qquad\Big(\Delta-\varepsilon\mu\,\frac{\partial^2}{\partial t^2}\Big)\mathbf{A} = -\mu\,\mathbf{j}_{\rm trans}.$$

For this gauge, only a solenoidal current density is therefore of interest. This occurs,
in particular, if there are no macroscopic charges (then even $\Phi = 0$ holds), e.g., for
the radiation field of single atoms. Therefore, it is sometimes called the radiation
gauge. However, it does have a disadvantage: for each Lorentz transformation, a new
gauge must be derived, because it is not Lorentz invariant.


3.3.5 Poynting’s Theorem 


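The Maxwell equations imply in particular

$$\mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}+\mathbf{H}\cdot\frac{\partial\mathbf{B}}{\partial t} = \mathbf{E}\cdot(\nabla\times\mathbf{H}-\mathbf{j})-\mathbf{H}\cdot(\nabla\times\mathbf{E}) = -\nabla\cdot(\mathbf{E}\times\mathbf{H})-\mathbf{j}\cdot\mathbf{E}.$$

We recognize the expression $\mathbf{j}\cdot\mathbf{E}$ from p. 188 as the power density for the Joule heat,
which does not arise in insulators. The power densities of the electric and magnetic
fields are given on the left. If $\mathbf{D}$ and $\mathbf{E}$ are related to each other linearly, then the
first term is the time-derivative of the known energy density $\tfrac12\,\mathbf{E}\cdot\mathbf{D}$ of the electric
field. If we also assume a linear relation between $\mathbf{H}$ and $\mathbf{B}$ (which is not allowed for
ferromagnets because of hysteresis), then we may take the expression $\tfrac12\,\mathbf{H}\cdot\mathbf{B}$ as the
energy density of the magnetic field. It is positive-definite and is suggested in view
of the similarity between the electric and magnetic field quantities. Thus we take

$$w = \tfrac12\,(\mathbf{E}\cdot\mathbf{D}+\mathbf{H}\cdot\mathbf{B})$$

as the energy density of an electromagnetic field and obtain Poynting's theorem:

$$\frac{\partial w}{\partial t}+\nabla\cdot(\mathbf{E}\times\mathbf{H}) = -\mathbf{j}\cdot\mathbf{E}.$$

If the Joule heat is missing, then this equation is similar to the continuity equation:
$\mathbf{E}\times\mathbf{H}$ is the energy flux density, which is also called the Poynting vector:

$$\mathbf{S} = \mathbf{E}\times\mathbf{H}.$$

In order to understand what it means for the stationary situation (with $\partial w/\partial t = 0$),
we consider a finite piece of a conductor in Fig. 3.16. Here we have $\nabla\cdot\mathbf{S} = -\mathbf{j}\cdot\mathbf{E} = -\sigma E^2$.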
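
The energy balance for the wire of Fig. 3.16 can be illustrated with a short calculation. This is a minimal sketch: the function name and the sample values in the usage note are assumptions, while the relations $E = U/l$, $H = I/(2\pi R)$ and $A = 2\pi R\,l$ are those of the figure.

```python
import math


def poynting_inflow(voltage, current, length, radius):
    """Power entering a straight wire through its curved surface.

    Uses E = U/l along the wire and H = I/(2 pi R) at the surface, so that
    S = E H is directed radially inwards and the surface area is A = 2 pi R l.
    """
    e_field = voltage / length
    h_field = current / (2 * math.pi * radius)
    s_magnitude = e_field * h_field          # |S| = |E x H| since E is perpendicular to H
    area = 2 * math.pi * radius * length     # curved surface of the cylinder
    return s_magnitude * area
```

For instance, `poynting_inflow(12.0, 2.0, 0.5, 1e-3)` evaluates to the Joule power $U I$ = 24 W, independently of the assumed length and radius.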

Because $\mathbf{B} = \nabla\times\mathbf{A}$ and $\mathbf{H}\cdot(\nabla\times\mathbf{A}) = \nabla\cdot(\mathbf{A}\times\mathbf{H})+\mathbf{A}\cdot(\nabla\times\mathbf{H})$, for quasi-stationary
currents (i.e., for $\nabla\times\mathbf{H} = \mathbf{j}$ and no contribution from the surface integrals
of $\mathbf{A}\times\mathbf{H}$), the now-justified ansatz $\tfrac12\,\mathbf{H}\cdot\mathbf{B}$ for the energy density of the magnetic
field leads to

$$\tfrac12\int dV\,\mathbf{H}\cdot\mathbf{B} = \tfrac12\int dV\,\mathbf{j}\cdot\mathbf{A}\;\widehat{=}\;\tfrac12\,I\oint d\mathbf{r}\cdot\mathbf{A} = \tfrac12\,L\,I^2,$$

where $L$ is now the self-inductance of the conductor. According to the Neumann
formula

$$L = \frac{\mu_0}{4\pi}\oint\oint\frac{d\mathbf{r}\cdot d\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|},$$

it can be determined, but no arbitrarily thin conductors can be taken, otherwise
$L$ diverges according to p. 203. We would then have to integrate over the mutual
inductances of the various current lines (Problem 3.30).

Fig. 3.16 Interpretation of the Poynting vector $\mathbf{S}$ for a stationary current along a wire of length $l$
with radius $R$. Here $\mathbf{S}$ flows from the outside through the curved surface $A$ and, because $E = U/l$
and $H = I/(2\pi R)$, it has the absolute value $S = U I/A$ there. The heat power $U I$ generated inside
then flows out through the curved surface, while the current flows through the faces

For the energy of two stationary currents, we derived the expression Epot = 
— fdVj-A on p. 199. Despite the other sign, this does not contradict the value 
just found for the self-energy. In the previous case, the current distributions were 
given and the mutual position and orientation of the loops were changed for fixed 
current density, while now it is the geometrical situation that is kept fixed and the 
current strength increases from zero to the final value. 
The energy of the electromagnetic field in thermodynamics is a “free energy”.
It can be fully used for work (more on that in Sect. 6.4.8). There, too, all energies
will be split into products of intensive and extensive quantities, which disproves
the microscopically suggested expression $\tfrac12\,(\varepsilon_0 E^2+B^2/\mu_0)$. Thermodynamically,
$\mathbf{D}$ and $\mathbf{H}$ must appear in addition to $\mathbf{E}$ and $\mathbf{B}$.
For static problems we left out integrals of the form

$$\int_V dV\,\nabla\cdot\mathbf{S} = \oint_{(V)} d\mathbf{f}\cdot\mathbf{S},$$

if integrations with boundaries at infinity were to be performed, since we assumed
that the integrand would decrease more strongly at infinity than $r^{-2}$: in fact, $E$ at
least as $r^{-2}$ and $H$ at least as $r^{-3}$ (monopole or dipole field). But for time-dependent
situations, $E$ and $H$ then decrease rather slowly with the distance from the radiation
source, whence the surface integral $\oint d\mathbf{f}\cdot\mathbf{S}$ does not vanish even for very large
volumes. We must still account for the radiation power, which we will only consider
in Sect. 3.3.7.


3.3.6 Oscillating Circuits 


If we connect a resistance $R$, an inductance $L$, and a capacity $C$ in series to an
AC voltage $U$, then the energy appears in three forms: in the resistance according to
p. 188 as Joule heat $\int R\,I^2\,dt$, in the inductance as magnetic energy $\tfrac12\,L\,I^2$, and in the
capacity as electric energy $\tfrac12\,Q^2/C$. All three together must be supplied to the setup.
We neglect the radiation power, which increases according to p. 264 as the fourth
power of the frequency and barely contributes for quasi-stationary situations. Since
$\dot{Q} = -I$, the total power is then $I\,(R\,I+L\,\dot{I}-Q/C)$. The expression in brackets
must be equal to the applied voltage. The derivative with respect to time yields

$$L\,\frac{d^2I}{dt^2}+R\,\frac{dI}{dt}+\frac{I}{C} = \frac{dU}{dt},$$

which is the differential equation of a forced damped oscillation, as in Sect. 2.3.8.
There the decay coefficient $\gamma = \tfrac12\,R/L$ and the angular frequency $\omega_0 = 1/\sqrt{LC}$ were
introduced, and it was shown that the initial eigenoscillation decays with time and
that the solution then oscillates with the angular frequency $\omega$ of the source of the
voltage. Therefore, we calculate in the final state, with

$$U = {\rm Re}\{\mathcal{U}\exp(-i\omega t)\}\quad\text{and}\quad I = {\rm Re}\{\mathcal{I}\exp(-i\omega t)\}.$$

The ansatz $\exp(+i\omega t)$ is often made, and this leads to the opposite sign of $i$ in the
following equations. For our choice, which is also common in quantum theory, its
value moves clockwise in the complex plane. $\mathcal{U}$ and $\mathcal{I}$ do not depend on time.
In the course of time, their products with $\exp(-i\omega t)$ become purely real as well
as also purely imaginary. Hence the differential equation leads to $(-\omega^2L-i\omega R+C^{-1})\,\mathcal{I} = -i\omega\,\mathcal{U}$ and then Ohm's law for AC currents, viz.,

$$\mathcal{U} = Z\,\mathcal{I},\quad\text{with impedance}\quad Z = R+i\Big(\frac{1}{\omega C}-\omega L\Big) = R+iX.$$
w 


It is composed of the active resistance $R$ and the reactance $X$. The imaginary part
shifts the phase between the voltage and current by $\phi = \arctan X/R$. The build-up
of the electromagnetic field takes time: in the capacitor the voltage follows the
current, while it precedes the current in the coil (see Fig. 3.17). Therefore, $|\phi| < \tfrac12\pi$
holds here, in contrast to the forced oscillation in Sect. 2.3.8 (see Fig. 2.23). For
low frequencies ($\omega\ll 1/\sqrt{LC}$), it is determined mainly by the capacity, and for
high frequencies by the inductance. ($R$ does not depend on the frequency, as long
as the conductivity does not depend on it, and it determines the power loss.) For
$\omega = \omega_0 = 1/\sqrt{LC}$, the reactance vanishes, and therefore the absolute value of the
impedance, the fictitious resistance $|Z|$, is particularly small.

Fig. 3.17 Absorption circuit. Resonance for $\omega_0 = 1/\sqrt{LC}$. Here, $\omega_0 L = 5R$. The panels show $|Z|/R$ and the phase $\phi(Z)$ against $\omega/\omega_0$

Fig. 3.18 Trap circuit. Resonance occurs for $\sqrt{\omega_0^2-(R/L)^2}$. Note that here $\omega_0 L = 5R$. The panels show $|Z|/\omega_0 L$ and the phase $\phi(Z)$ against $\omega/\omega_0$


Corresponding to Kirchhoff's laws, we have added here the individual contributions
of the three parts of the conductor. For parallel connection of a capacitor
(capacity $C$) and a coil (inductance $L$ and resistance $R$), we have in contrast
$Z^{-1} = (R-i\omega L)^{-1}-i\omega C$ (see Fig. 3.18):

$$\frac{Z}{\omega_0 L} = \frac{(R/\omega_0 L)+i\,(\omega/\omega_0)\,\{(\omega/\omega_0)^2-1+(R/\omega_0 L)^2\}}{\{(\omega/\omega_0)^2-1\}^2+(R/\omega_0 L)^2\,(\omega/\omega_0)^2}.$$

The fictitious resistance is now highest for $\omega = \sqrt{\omega_0^2-(R/L)^2}$, where it is equal to
$(\omega_0 L)^2/R = L/(RC)$. Therefore, we also refer to such connections as trap circuits
(and, if connected in series, as absorption circuits).
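
The two impedance formulas can be evaluated directly with complex arithmetic. The following sketch uses the $\exp(-i\omega t)$ convention of the text; the function names and the component values in the usage note are illustrative assumptions.

```python
import math


def series_impedance(omega, R, L, C):
    """Absorption circuit: Z = R + i (1/(omega C) - omega L)."""
    return complex(R, 1.0 / (omega * C) - omega * L)


def parallel_impedance(omega, R, L, C):
    """Trap circuit: 1/Z = 1/(R - i omega L) - i omega C."""
    return 1.0 / (1.0 / complex(R, -omega * L) - 1j * omega * C)


def phase_shift(Z):
    """Phase shift phi = arctan(X/R) between voltage and current."""
    return math.atan2(Z.imag, Z.real)
```

As a usage example with assumed values $L = 1$ mH, $C = 1$ µF and $R = \omega_0 L/5$ (the ratio used in Figs. 3.17 and 3.18): at $\omega = \omega_0$ the series impedance reduces to $R$, and at $\omega = \sqrt{\omega_0^2-(R/L)^2}$ the parallel impedance becomes real and equal to $L/(RC)$.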
3.3.7 Momentum of the Radiation Field 


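With the force density $\rho\,\mathbf{E}+\mathbf{j}\times\mathbf{B}$, Maxwell's equations read

$$\rho\,\mathbf{E}+\mathbf{j}\times\mathbf{B} = \nabla\cdot\mathbf{D}\;\mathbf{E}+\Big(\nabla\times\mathbf{H}-\frac{\partial\mathbf{D}}{\partial t}\Big)\times\mathbf{B}.$$

Here, the last vector product can be rewritten

$$\mathbf{B}\times\frac{\partial\mathbf{D}}{\partial t} = -\frac{\partial(\mathbf{D}\times\mathbf{B})}{\partial t}+\mathbf{D}\times\frac{\partial\mathbf{B}}{\partial t} = -\frac{\partial(\mathbf{D}\times\mathbf{B})}{\partial t}-\mathbf{D}\times(\nabla\times\mathbf{E}).$$

Because $\nabla\cdot\mathbf{B} = 0$, we therefore have

$$\mathbf{F}+\frac{d}{dt}\int dV\,\mathbf{D}\times\mathbf{B} = \int dV\,\{\mathbf{E}\,\nabla\cdot\mathbf{D}-\mathbf{D}\times(\nabla\times\mathbf{E})+\mathbf{H}\,\nabla\cdot\mathbf{B}-\mathbf{B}\times(\nabla\times\mathbf{H})\}.$$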


We restrict ourselves to homogeneous matter, but allow also for anisotropic, preferential
directions. Then the permittivity and the permeability are tensors, and oblique
coordinates can be useful, although at least rectilinear ones. According to p. 184, for
homogeneous matter we have

$$\mathbf{E}\,\nabla\cdot\mathbf{D}-\mathbf{D}\times(\nabla\times\mathbf{E}) = \sum_{ik}\mathbf{g}_i\,\frac{\partial\,(E^iD^k-\tfrac12\,g^{ik}\,\mathbf{E}\cdot\mathbf{D})}{\partial x^k},$$

and likewise with $\mathbf{H}$, $\mathbf{B}$ instead of $\mathbf{E}$, $\mathbf{D}$. Therefore, we now generalize Maxwell's
stress tensor from p. 184 to include magnetic field contributions (it is symmetric only
for isotropic media):

$$T^{ik} = \tfrac12\,g^{ik}\,(\mathbf{E}\cdot\mathbf{D}+\mathbf{H}\cdot\mathbf{B})-E^iD^k-H^iB^k,$$

and according to Gauss's theorem and Sects. 1.2.4 and 1.2.5, obtain for time-dependent
fields

$$\mathbf{F}+\frac{d}{dt}\int_V dV\,\mathbf{D}\times\mathbf{B}+\sum_{ik}\mathbf{g}_i\oint_{(V)} df_k\,T^{ik} = 0.$$

According to this, we have to view $\mathbf{D}\times\mathbf{B}$ as a momentum density. For isotropic
media, it is equal to $\varepsilon\mu\,\mathbf{S}$ and then has the same direction as the energy flux density
$\mathbf{S}$, but a different one for anisotropic media.


3.3.8 Propagation of Waves in Insulators 


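In insulators, i.e., if $\sigma$ and $\mathbf{j}$ vanish, and for constant $\varepsilon$ and $\mu$, we have

$$\Big(\varepsilon\mu\,\frac{\partial^2}{\partial t^2}-\Delta\Big)\mathbf{A}(t,\mathbf{r}) = 0,$$

according to Sect. 3.3.4 (see in particular p. 210), and this for both the Lorentz and
the Coulomb gauge. This (homogeneous) wave equation for a vector field is also
encountered for the electric and magnetic fields. In particular, in the insulator,

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t},\quad\nabla\cdot\mathbf{B} = 0,\quad\nabla\cdot\mathbf{D} = 0,\quad\text{and}\quad\nabla\times\mathbf{H} = \frac{\partial\mathbf{D}}{\partial t}.$$

Hence, since $\Delta\mathbf{a} = \nabla(\nabla\cdot\mathbf{a})-\nabla\times(\nabla\times\mathbf{a})$, for $\mathbf{D} = \varepsilon\,\mathbf{E}$ and $\mathbf{B} = \mu\,\mathbf{H}$ we have

$$\Delta\mathbf{E} = \nabla\times\frac{\partial\mathbf{B}}{\partial t} = \mu\,\frac{\partial}{\partial t}\,\nabla\times\mathbf{H} = \varepsilon\mu\,\frac{\partial^2\mathbf{E}}{\partial t^2},$$

$$\Delta\mathbf{B} = -\mu\,\nabla\times\frac{\partial\mathbf{D}}{\partial t} = -\mu\,\frac{\partial}{\partial t}\,\nabla\times\mathbf{D} = \varepsilon\mu\,\frac{\partial^2\mathbf{B}}{\partial t^2}.$$

According to these wave equations, we find the phase velocity $c$ from the permittivity
$\varepsilon$ and permeability $\mu$:

$$\varepsilon\mu = c^{-2},\quad\text{in particular in vacuum}\quad\varepsilon_0\mu_0 = c_0^{-2}.$$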


This is Weber’s equation. In electromagnetism, in contrast to (non-relativistic) 
mechanics where all velocities are on an equal footing, a particular velocity is sin- 
gled out. This is connected with the question of Lorentz invariance, discussed in the 
next section. If it is taken as an observational fact (Michelson experiment), charge 
conservation and Coulomb’s law from the microscopic Maxwell equations can be 
derived from it, even without knowing anything about the magnetic field. However, 
the charge and magnetic moment of elementary particles are not properties on an 
equal footing. 
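
The relation $\varepsilon\mu = c^{-2}$ is easy to evaluate numerically. In this hedged sketch the numerical value of $\varepsilon_0$ (the CODATA 2018 value) and the function names are assumptions that do not appear in the text; $\mu_0 = 4\pi\times10^{-7}$ N/A² is the value used in Sect. 3.2.9.

```python
import math

MU0 = 4e-7 * math.pi       # magnetic field constant in N/A^2 (value used in the text)
EPS0 = 8.8541878128e-12    # electric field constant in F/m (assumed CODATA 2018 value)


def phase_velocity(eps_r=1.0, mu_r=1.0):
    """c = 1/sqrt(eps * mu) for a medium with relative permittivity and permeability."""
    return 1.0 / math.sqrt(eps_r * EPS0 * mu_r * MU0)


def wavelength(frequency, eps_r=1.0, mu_r=1.0):
    """lambda = 2 pi / k = c / f for a wave of the given frequency in the medium."""
    return phase_velocity(eps_r, mu_r) / frequency
```

With the default arguments, `phase_velocity()` returns the vacuum value $c_0$; a non-magnetic insulator with relative permittivity 4 halves both the phase velocity and the wavelength.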

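The wave equation is a homogeneous partial differential equation of second order.
In order to solve it, we take the Fourier transform (see Sect. 1.1.11) $\mathbf{A}(t,\mathbf{r})\to\mathbf{A}(t,\mathbf{k})$. Hence with $\omega = ck$, the partial differential equation can be simplified to

$$\Big(\frac{1}{c^2}\frac{\partial^2}{\partial t^2}-\Delta\Big)\mathbf{A}(t,\mathbf{r}) = 0\quad\Longrightarrow\quad\Big(\frac{\partial^2}{\partial t^2}+\omega^2\Big)\mathbf{A}(t,\mathbf{k}) = 0.$$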


Since A(t, r) must be real, A*(t, k) = A(t, −k). Therefore, the solution of the
differential equation reads

$$\mathbf A(t,\mathbf k) = \frac{\mathbf A(\mathbf k)\,\exp(-i\omega t) + \mathbf A^*(-\mathbf k)\,\exp(+i\omega t)}{2} .$$

Here, the factor 1/2 is arbitrary (it only has to be real), but it is nevertheless useful
for what follows, because we then have A(k) from the initial values A(0, k) =
½ {A(k) + A*(−k)} and ∂A(t, k)/∂t|_{t=0} = −½ iω {A(k) − A*(−k)} as

$$\mathbf A(\mathbf k) = \mathbf A(0,\mathbf k) + \frac{i}{\omega}\,\frac{\partial\mathbf A(t,\mathbf k)}{\partial t}\Big|_{t=0} .$$

Fig. 3.19 Linearly polarized electromagnetic wave. The polarization plane (thus E)
(red curve) lies in the plane of the page and B (blue curve) is perpendicular to it


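Finally, because exp{i(k · r + ωt)} = (exp{i(−k · r − ωt)})* (and rewriting k →
−k), it follows that

$$\mathbf A(t,\mathbf r) = \frac{1}{\sqrt{2\pi}^{\,3}}\int d^3k\ \mathrm{Re}\big(\mathbf A(\mathbf k)\,\exp\{i(\mathbf k\cdot\mathbf r - \omega t)\}\big)$$

with ω = ck and

$$\mathbf A(\mathbf k) = \frac{1}{\sqrt{2\pi}^{\,3}}\int d^3r\ \exp(-i\mathbf k\cdot\mathbf r)\,\Big(\mathbf A(0,\mathbf r) + \frac{i}{\omega}\,\frac{\partial\mathbf A(t,\mathbf r)}{\partial t}\Big|_{t=0}\Big) .$$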


If we restrict ourselves to one value k, then this gives the propagation direction of
the wave in which it travels through the homogeneous (and isotropic) medium with
velocity c = 1/√(εμ) and wavelength λ = 2π/k.

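In a non-conductor, the fields E and B are solenoidal, thus transverse:

$$\mathbf k\cdot\mathbf E(t,\mathbf k) = 0 \quad\text{and}\quad \mathbf k\cdot\mathbf B(t,\mathbf k) = 0 .$$

The vector potential is only solenoidal for the “transverse gauge” (Coulomb gauge):

$$\nabla\cdot\mathbf A = 0 \quad\Longrightarrow\quad \mathbf k\cdot\mathbf A(t,\mathbf k) = 0 .$$

For the position and time dependence exp{i(k · r − ωt)} of the fields, the equation
−iω B(k) = −ik × E(k) follows from the induction law ∂B/∂t = −∇ × E:

$$c\,\mathbf B(\mathbf k) = \mathbf e_k\times\mathbf E(\mathbf k), \quad\text{with}\quad \mathbf e_k \equiv \frac{\mathbf k}{k} .$$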


For ω ≠ 0, the three vectors k, E(k), and B(k) thus form a right-handed rectangular
frame, and in homogeneous insulators we need only E(k) or B(k) (see Fig. 3.19).
However, this is not yet useful for the energy density ½ (E · D + H · B) and the
energy flux density E × H, since for a bilinear expression, a double integral over
k and k′ would have to be performed. If we average over time, then we arrive at
least at δ(ω + ω′) or at δ(k + k′), respectively, and if we average over space, also
at δ(k + k′) with vector arguments. Here the Fourier components corresponding to k and −k are related,
because the fields are real. We consider therefore the special case with fixed k:

$$\mathbf E(t,\mathbf r) = \mathrm{Re}\big(\mathbf E(\mathbf k)\,\exp\{i(\mathbf k\cdot\mathbf r - \omega t)\}\big) .$$

The Maxwell equations require ω = ck, k · E(k) = 0, and

$$c\,\mathbf B(t,\mathbf r) = \mathrm{Re}\big(\mathbf e_k\times\mathbf E(\mathbf k)\,\exp\{i(\mathbf k\cdot\mathbf r - \omega t)\}\big) .$$


Because Re z = ½ (z + z*), the expression ½ E*(k) · D(k) follows for the time-averaged
value of E · D. For the mean value of H · B, we find the same, because
the fields are transverse. The average energy density is

$$\overline{w}(t,\mathbf r) = \tfrac12\,\mathbf E^*(\mathbf k)\cdot\mathbf D(\mathbf k) = \tfrac12\,\mathbf H^*(\mathbf k)\cdot\mathbf B(\mathbf k) .$$

Therefore, from the average energy density w̄, we can also determine the amplitude
Ê of the field strength:

$$\hat E = \sqrt{\frac{2\,\overline{w}}{\varepsilon}} .$$


This expression is needed, e.g., for the energy of interaction between a wave with
energy ħω = w̄ V and the dipole moment p of an atom, yielding

$$W = p\,\sqrt{\frac{2\hbar\omega}{\varepsilon V}}\,\cos(\omega t) .$$

For the mean value of the Poynting vector, we obtain

$$\overline{\mathbf S}(t,\mathbf r) = c\,\overline{w}(t,\mathbf r)\,\mathbf e_k .$$


Note that the bars are often left out, but the equations are valid only for the average. For
the velocity (S/w) of the energy flux, we thus obtain c e_k, a vector of absolute value c
in the propagation direction k of the wave. The momentum density εμ S has the same
direction, and its absolute value is equal to w/c, from Weber’s equation εμ = c⁻².
In Sect. 3.4.9, we shall also arrive at this ratio between energy and momentum for
massless free particles.
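The relation between the mean Poynting vector and the mean energy density can be verified for a single Fourier component. The sketch below is only an illustration under stated assumptions: vectors are plain 3-tuples, the amplitude E(k) is assumed transverse, the time average of E × H is evaluated as ½ Re(E* × H), and the function name `plane_wave_averages` as well as the numerical values are hypothetical.

```python
import math


def cross(a, b):
    """Vector product of two 3-component tuples (entries may be complex)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_wave_averages(E, e_k, eps, mu):
    """Return (c, mean energy density, mean Poynting vector) for one amplitude E(k).

    E   : complex amplitude vector E(k), assumed transverse (e_k . E = 0)
    e_k : real unit vector along the wave vector k
    """
    c = 1.0 / math.sqrt(eps * mu)                       # Weber's equation
    H = tuple(x / (mu * c) for x in cross(e_k, E))      # H = e_k x E / (mu c)
    w_mean = 0.5 * eps * sum(abs(x) ** 2 for x in E)    # (1/2) E* . D with D = eps E
    E_conj = tuple(complex(x).conjugate() for x in E)
    # Time average of E x H for fields Re(... exp(-i omega t)): (1/2) Re(E* x H)
    S_mean = tuple(0.5 * complex(x).real for x in cross(E_conj, H))
    return c, w_mean, S_mean


# Hypothetical numbers: elliptically polarized wave travelling along the third axis.
c, w_mean, S_mean = plane_wave_averages(
    E=(1.0 + 0.0j, 2.0j, 0.0j), e_k=(0.0, 0.0, 1.0), eps=2.0, mu=0.5)
print(c, w_mean, S_mean)
```

The only non-vanishing component of the mean Poynting vector lies along e_k and equals c times the mean energy density.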

A further feature of electromagnetic radiation is its polarization direction. Here
we mean the oscillation direction of the electric field; the magnetic field oscillates
perpendicular to it, since ωB(k) = k × E(k). Therefore, two unit vectors
e∥ and e⊥ with e∥ · e⊥ = 0 and e∥ × e⊥ = e_k suffice for the expansion of the field
vectors. Then we have, e.g.,

$$\mathbf E(\mathbf k) = \mathbf e_\parallel\,E_\parallel + \mathbf e_\perp\,E_\perp .$$


The direction of the two unit vectors is thus not yet uniquely fixed. We are free to
choose a preferred direction. For the example of diffraction, we take the plane of
incidence as the preferred direction: e∥ lies in the plane, e⊥ is perpendicular to it.

The amplitudes E(k) are Fourier components of the real quantities E(t, r) and so
have complex components E∥ and E⊥. Therefore, if we set E = |E| exp(iβ), then
in the plane k · r = 0, it follows that

$$\mathbf E(t,\mathbf r) = \mathrm{Re}\{\mathbf E(\mathbf k)\exp(-i\omega t)\} = \mathbf e_\parallel\,|E_\parallel|\cos(\omega t - \beta_\parallel) + \mathbf e_\perp\,|E_\perp|\cos(\omega t - \beta_\perp) .$$


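Instead of the two phases β∥ and β⊥, we use their difference δβ = β⊥ − β∥ and
their mean value β̄ = ½ (β∥ + β⊥):

$$\begin{aligned}
\mathbf E(t,\mathbf r) ={}& \{\mathbf e_\parallel\,|E_\parallel| + \mathbf e_\perp\,|E_\perp|\}\,\cos(\tfrac12\delta\beta)\,\cos(\omega t - \bar\beta)\\
&- \{\mathbf e_\parallel\,|E_\parallel| - \mathbf e_\perp\,|E_\perp|\}\,\sin(\tfrac12\delta\beta)\,\sin(\omega t - \bar\beta) .
\end{aligned}$$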


In general, this is an elliptically polarized wave, because a cos(ωt − β̄) +
b sin(ωt − β̄) traces out an ellipse. For a ∥ b, we obtain a piece of a straight line (linearly
polarized wave) and for a = b with a ⊥ b, a circle. Therefore, for |E∥| = |E⊥|
with δβ = ½π (mod π), the wave is circularly polarized. For δβ = +½π, the field
rotates within a quarter period from the direction e∥ to +e⊥. In optics, we speak of left-
or right-circularly polarized light, depending on how the field vector rotates when we
view against the ray direction, anticlockwise or clockwise: δβ = +½π corresponds
to left-circular polarization. In contrast, in particle physics, we view along the ray
direction and for δβ = +½π, we speak of positive helicity (right-handedness) and
for δβ = −½π, we speak of negative helicity (left-handedness).

Instead of linear polarization, we may of course expand in terms of circularly 
polarized light: 


$$\mathbf E(\mathbf k) = \mathbf e_+\,E_+ + \mathbf e_-\,E_- .$$


Because Re{E(k) exp(−iωt)} = Re E(k) cos(ωt) + Im E(k) sin(ωt), for circularly
polarized light, Re E(k) must be perpendicular to Im E(k). This property must be
satisfied by the vectors e±. We take complex unit vectors and set

$$\mathbf e_\pm = \frac{\mathbf e_\parallel \pm i\,\mathbf e_\perp}{\sqrt2}\,\exp(i\varphi_\pm) ,$$

where e₊ is appropriate for positive helicity and e₋ for negative. The phases φ±
may be chosen arbitrarily, e.g., such that the coefficients E± are real. (Note that, in
Sect. 5.5.1, we shall take the factor ∓ instead of exp(iφ±).) In any case, we always
have

$$\mathbf e_\pm^*\cdot\mathbf e_\pm = 1 \quad\text{and}\quad \mathbf e_\pm^*\cdot\mathbf e_\mp = 0 ,$$

and hence E± = e±* · E(k). In addition,

$$\mathbf e_\pm^*\times\mathbf e_\pm = \pm i\,\mathbf e_k$$

is independent of the phase factor.
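The properties of the circular basis can be checked with a few lines of code. This is a minimal sketch, assuming that all vectors are written as component triples in the right-handed basis (e∥, e⊥, e_k); the function names are illustrative and not part of the text.

```python
import cmath
import math


def circular_unit_vector(sign, phase=0.0):
    """e_+ (sign=+1) or e_- (sign=-1): (e_par +/- i e_perp)/sqrt(2) * exp(i*phase)."""
    f = cmath.exp(1j * phase) / math.sqrt(2.0)
    return (f, sign * 1j * f, 0.0j)


def conj_dot(a, b):
    """Scalar product a* . b of two complex 3-tuples."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def circular_components(E, phase_plus=0.0, phase_minus=0.0):
    """Return (E_+, E_-) with E_pm = e_pm* . E(k)."""
    return (conj_dot(circular_unit_vector(+1, phase_plus), E),
            conj_dot(circular_unit_vector(-1, phase_minus), E))


# Wave with |E_par| = |E_perp| and phase difference +pi/2, i.e. E_perp = i E_par.
E_plus, E_minus = circular_components((1.0 + 0.0j, 1.0j, 0.0j))
print(abs(E_plus), abs(E_minus))
```

For this input only the component of positive helicity survives, in line with the discussion of δβ = +½π above.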


3.3.9 Reflection and Diffraction at a Plane 


We consider the boundary plane between two insulators and let a plane wave with
wave vector k_e fall onto the interface. Then there is a diffracted (transmitted) wave
with wave vector k_d, and a reflected wave with wave vector k_r (Problem 3.40) (see
Fig. 3.20).

According to Maxwell’s equations, we have the boundary conditions (see p. 207)

$$\begin{aligned}
\mathbf n\times(\mathbf E_e + \mathbf E_r - \mathbf E_d) &= 0 , & \mathbf n\cdot(\mathbf B_e + \mathbf B_r - \mathbf B_d) &= 0 ,\\
\mathbf n\cdot(\mathbf D_e + \mathbf D_r - \mathbf D_d) &= 0 , & \mathbf n\times(\mathbf H_e + \mathbf H_r - \mathbf H_d) &= 0 .
\end{aligned}$$


Since these always have to hold, all three waves must have the same angular frequency
ω, because only then will their exponential functions exp(−iωt) always agree with
each other. Likewise, for all positions r on the interface, we must require

$$\mathbf k_e\cdot\mathbf r = \mathbf k_r\cdot\mathbf r = \mathbf k_d\cdot\mathbf r ,$$

since only then can the exponential functions exp(ik · r) be the same everywhere at
the interface. If r is perpendicular to k_e, then it is clearly also perpendicular to k_r and
k_d: all three vectors k_e, k_r, and k_d lie in the plane spanned by k_e and n, the plane of
incidence. If on the other hand we take a vector r along the intersecting line of the
interface and the plane of incidence, namely the vector t, then the three wave vectors
must have equal tangential components:

$$k_e\sin\theta_e = k_r\sin\theta_r = k_d\sin\theta_d .$$

Fig. 3.20 Wave vectors k_e, k_r, and k_d at a beam splitter, an interface with the normal vector n, the
unit vector t in the plane of incidence, and the angles θ_e, θ_r, and θ_d. The three wave vectors have
(as proven in the text) equal tangential components and k_e and k_r opposite normal components.
In addition, k_d/k_e = c_e/c_d holds, and the ratio of the indicated circular radii is thus equal to the
refractive index n


Now because ω = ck, we also have k_r = k_e and c_d k_d = c_e k_e, and therefore,

$$\sin\theta_e = \sin\theta_r \quad\text{and}\quad \frac{\sin\theta_e}{\sin\theta_d} = \frac{c_e}{c_d} = \sqrt{\frac{\varepsilon_d\,\mu_d}{\varepsilon_e\,\mu_e}} \equiv n ,$$

which is the Snellius diffraction law (see Fig. 3.20). The ratio c_e/c_d of the velocities
is the refractive index n. One should not take the static values, since the material constants
depend upon the frequency (dispersion).
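The law just derived is easily turned into a small helper. The sketch below assumes angles in radians and returns `None` when no real angle θ_d exists; the name `refraction_angle` is an illustrative choice.

```python
import math


def refraction_angle(theta_e, n):
    """Angle theta_d of the transmitted wave from sin(theta_e) = n * sin(theta_d).

    theta_e : angle of incidence in radians
    n       : refractive index c_e / c_d of the transition
    Returns None if sin(theta_e) / n exceeds 1 (no real theta_d exists).
    """
    s = math.sin(theta_e) / n
    if abs(s) > 1.0:
        return None
    return math.asin(s)


theta_d = refraction_angle(math.radians(30.0), 1.5)       # air to glass, n = 3/2
print(math.degrees(theta_d))
print(refraction_angle(math.radians(60.0), 1.0 / 1.5))    # glass to air
```

For the transition from glass to air at 60° the function returns `None`, since sin θ_e / n is then larger than 1.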

After the relations between the wave vectors, we now investigate those between 
the field amplitudes. To this end, it is useful to express all fields in terms of E(k) 
because, for linearly polarized light, the oscillation direction of the electric field is 
defined as the polarization direction: 


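$$\mathbf D = \varepsilon\,\mathbf E , \qquad \mathbf B = \mathbf e_k\times\mathbf E/c , \qquad \mathbf H = \mathbf e_k\times\mathbf E/(\mu c) .$$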


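The set of boundary conditions provides a system of equations. In order to solve
these, we introduce the two unit vectors t and b = t × n in addition to the normal
vector n (b in Fig. 3.20 points toward the observer). With k = t t · k + n n · k and
using the Snellius diffraction law, we find

$$\begin{aligned}
\mathbf t\cdot\mathbf k_e &= +k_e\sin\theta_e = +\,\mathbf t\cdot\mathbf k_r , & \mathbf t\cdot\mathbf k_d &= +k_d\sin\theta_d ,\\
\mathbf n\cdot\mathbf k_e &= -k_e\cos\theta_e = -\,\mathbf n\cdot\mathbf k_r , & \mathbf n\cdot\mathbf k_d &= -k_d\cos\theta_d .
\end{aligned}$$

If we decompose these three E vectors into their perpendicularly polarized components
E⊥ = E · b (perpendicular to the plane of incidence) and their parallel
polarized components E∥ = E · (b × e_k) (in the plane of incidence),

$$\mathbf E = \mathbf b\,E_\perp + \mathbf b\times\mathbf e_k\,E_\parallel ,$$

then, because k × E = k × b E⊥ + b k E∥ and k × b = −n t · k + t n · k, we
have

$$\begin{aligned}
\mathbf n\cdot\mathbf E &= \mathbf t\cdot\mathbf e_k\,E_\parallel ,\\
\mathbf n\cdot(\mathbf k\times\mathbf E) &= -\,\mathbf t\cdot\mathbf k\,E_\perp ,\\
\mathbf n\times(\mathbf k\times\mathbf E) &= \mathbf t\,k\,E_\parallel - \mathbf b\ \mathbf n\cdot\mathbf k\,E_\perp ,\\
\mathbf n\times\mathbf E &= \mathbf b\ \mathbf n\cdot\mathbf e_k\,E_\parallel + \mathbf t\,E_\perp .
\end{aligned}$$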


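Hence the boundary conditions for the normal components yield

$$\frac{\sin\theta_e}{c_e}\,(E_{e\perp} + E_{r\perp}) = \frac{\sin\theta_d}{c_d}\,E_{d\perp} , \qquad \varepsilon_e\sin\theta_e\,(E_{e\parallel} + E_{r\parallel}) = \varepsilon_d\sin\theta_d\,E_{d\parallel} ,$$

which are already contained in the requirements for the tangential components, if
we take into account the Snellius diffraction law sin θ_e : sin θ_d = c_e : c_d and Weber’s
equation:

$$\begin{aligned}
\cos\theta_e\,(E_{e\parallel} - E_{r\parallel}) &= \cos\theta_d\,E_{d\parallel} , & E_{e\perp} + E_{r\perp} &= E_{d\perp} ,\\
\frac{1}{\mu_e c_e}\,(E_{e\parallel} + E_{r\parallel}) &= \frac{1}{\mu_d c_d}\,E_{d\parallel} , & \frac{\cos\theta_e}{\mu_e c_e}\,(E_{e\perp} - E_{r\perp}) &= \frac{\cos\theta_d}{\mu_d c_d}\,E_{d\perp} .
\end{aligned}$$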


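Therefore, with

$$n' \equiv \frac{\mu_e\,c_e}{\mu_d\,c_d} = \frac{\mu_e}{\mu_d}\,n$$

(in insulators, in particular, μ ≈ μ₀ and hence n′ ≈ n, thus n ≈ √(ε_d/ε_e)), we
obtain

$$\begin{aligned}
\frac{E_{r\parallel}}{E_{e\parallel}} &= \frac{n'\cos\theta_e - \cos\theta_d}{n'\cos\theta_e + \cos\theta_d} , & \frac{E_{r\perp}}{E_{e\perp}} &= \frac{\cos\theta_e - n'\cos\theta_d}{\cos\theta_e + n'\cos\theta_d} ,\\
\frac{E_{d\parallel}}{E_{e\parallel}} &= \frac{1}{n'}\Big(1 + \frac{E_{r\parallel}}{E_{e\parallel}}\Big) , & \frac{E_{d\perp}}{E_{e\perp}} &= 1 + \frac{E_{r\perp}}{E_{e\perp}} .
\end{aligned}$$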


For the corresponding equations for the magnetic field strength B, the factor n is
included in the lower row, because E and B differ by the velocity c. Note in addition
that B oscillates in a direction perpendicular to E. Clearly, for perpendicular incidence
and n′ = 1, nothing is reflected, i.e., if the wave resistance cμ = √(μ/ε) remains
the same (the value for the vacuum is approximately 377 Ω, according to p. 165).

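If, after the approximation n′ ≈ n, we use the diffraction law sin θ_e = n sin θ_d,
Fresnel’s equations follow (see Fig. 3.21):

$$\begin{aligned}
\frac{E_{r\parallel}}{E_{e\parallel}} &= \frac{\tan(\theta_e - \theta_d)}{\tan(\theta_e + \theta_d)} , & \frac{E_{r\perp}}{E_{e\perp}} &= -\,\frac{\sin(\theta_e - \theta_d)}{\sin(\theta_e + \theta_d)} ,\\
\frac{E_{d\parallel}}{E_{e\parallel}} &= \frac{1}{\cos(\theta_e - \theta_d)}\,\frac{E_{d\perp}}{E_{e\perp}} , & \frac{E_{d\perp}}{E_{e\perp}} &= 1 + \frac{E_{r\perp}}{E_{e\perp}} .
\end{aligned}$$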


Because

$$\tan(\alpha \pm \beta) = \frac{\sin\alpha\cos\beta \pm \cos\alpha\sin\beta}{\cos\alpha\cos\beta \mp \sin\alpha\sin\beta}$$

and cos² α + sin² α = 1, we have

$$\frac{\tan(\alpha - \beta)}{\tan(\alpha + \beta)} = \frac{\sin\alpha\cos\alpha - \sin\beta\cos\beta}{\sin\alpha\cos\alpha + \sin\beta\cos\beta} .$$

Fig. 3.21 Fresnel’s equations for the transition from air to glass (n = 3/2) and back. Brewster
angle ○. Limiting angle for total reflection ● (for larger angles, only Re E_r/E_e is shown)


Part of the result could have been obtained without the calculation above. If the
transmitted field strength oscillates in the direction of k_r, then the reflected component
E∥ is missing, i.e., E_r∥ = 0 for k_r ⊥ k_d or θ_d = 90° − θ_r = 90° − θ_e. Since n =
sin θ_e / sin θ_d, the Brewster angle is found to be

$$\theta_e = \arctan n .$$

At this angle, the reflected wave is linearly polarized, so E oscillates only perpendicularly to the
plane of incidence. Note that, without the approximation n′ ≈ n, the Brewster angle
is found to be arctan (n √((n′² − 1)/(n² − 1))).

As a function of the angle of incidence and the refractive index in the approximation
n′ ≈ n, it follows that

$$\begin{aligned}
\frac{E_{r\parallel}}{E_{e\parallel}} &= \frac{n^2\cos\theta_e - \sqrt{n^2 - \sin^2\theta_e}}{n^2\cos\theta_e + \sqrt{n^2 - \sin^2\theta_e}} , & \frac{E_{r\perp}}{E_{e\perp}} &= \frac{\cos\theta_e - \sqrt{n^2 - \sin^2\theta_e}}{\cos\theta_e + \sqrt{n^2 - \sin^2\theta_e}} ,\\
\frac{E_{d\parallel}}{E_{e\parallel}} &= \frac{1}{n}\Big(1 + \frac{E_{r\parallel}}{E_{e\parallel}}\Big) , & \frac{E_{d\perp}}{E_{e\perp}} &= 1 + \frac{E_{r\perp}}{E_{e\perp}} ,
\end{aligned}$$

where we have used cos θ_d = √(1 − sin² θ_d) = √(1 − n⁻² sin² θ_e). For n < 1, there is a
limiting angle for total reflection, viz., θ_e = arcsin n. For higher angles of incidence,
the amplitude ratio E_r/E_e is complex (of absolute value 1) and the refractive index
likewise. Linearly polarized radiation then becomes elliptically polarized, and the
transmitted solution is damped. We shall not discuss this here, because we shall deal
with damped solutions (in space) in the next section anyway. We sometimes speak
of evanescent waves.
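The amplitude ratios as functions of θ_e and n can be evaluated directly. The following sketch uses the approximation n′ ≈ n and lets `cmath.sqrt` supply a complex square root beyond the limiting angle; which branch of the root is physically appropriate for the damped transmitted wave is an assumption that is not fixed here, but the absolute value of the reflected ratio does not depend on it. The function name `fresnel` is illustrative.

```python
import cmath
import math


def fresnel(theta_e, n):
    """Amplitude ratios (r_par, r_perp, t_par, t_perp) in the approximation n' = n.

    theta_e : angle of incidence in radians
    n       : refractive index c_e / c_d
    The results are complex numbers; they have a non-zero imaginary part
    only beyond the limiting angle for total reflection.
    """
    cos_e = math.cos(theta_e)
    root = cmath.sqrt(n * n - math.sin(theta_e) ** 2)   # equals n * cos(theta_d)
    r_par = (n * n * cos_e - root) / (n * n * cos_e + root)
    r_perp = (cos_e - root) / (cos_e + root)
    t_par = (1.0 + r_par) / n
    t_perp = 1.0 + r_perp
    return r_par, r_perp, t_par, t_perp


n_glass = 1.5                                   # air to glass, as in Fig. 3.21
print(fresnel(0.0, n_glass))                    # perpendicular incidence
print(fresnel(math.atan(n_glass), n_glass)[0])  # parallel component at the Brewster angle
print(abs(fresnel(math.radians(60.0), 1.0 / n_glass)[0]))  # beyond the limiting angle
```

At the Brewster angle the parallel reflected component vanishes up to rounding, and beyond the limiting angle both reflected ratios have absolute value 1.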




3.3.10 Propagation of Waves in Conductors 
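In contrast to the last two sections, we shall no longer restrict ourselves to σ = 0.
Then,

$$\nabla\times\mathbf E = -\frac{\partial\mathbf B}{\partial t} ,\quad \nabla\cdot\mathbf B = 0 ,\quad \nabla\cdot\mathbf D = 0 ,\quad \nabla\times\mathbf H = \Big(\sigma + \varepsilon\,\frac{\partial}{\partial t}\Big)\mathbf E .$$

Here, electromagnetic energy is converted into heat and hence, for a homogeneous
medium, the wave equations gain a damping term

$$\Big[\Delta - \mu\Big(\sigma + \varepsilon\,\frac{\partial}{\partial t}\Big)\frac{\partial}{\partial t}\Big]\mathbf E = 0 , \qquad \nabla\cdot\mathbf E = 0 ,$$

and likewise with B instead of E. These are the telegraph equations.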

If an external wave impinges on a conductor surface, then the fields depend periodically
on time. We have to investigate the position dependence in the conductor.
According to the telegraph equation, the ansatz

$$\mathbf E(t,\mathbf r) = \mathrm{Re}\big(\mathbf E(\mathbf k')\,\exp\{i(\mathbf k'\cdot\mathbf r - \omega t)\}\big)$$

for all positions in the conductor leads to the condition

$$\mathbf k'\cdot\mathbf k' = \varepsilon\mu\,\omega^2\,\Big(1 + i\,\frac{\sigma}{\varepsilon\omega}\Big) .$$

This can be satisfied for real ω only with a complex wave vector. A complex permittivity
ε (1 + iσ/εω) is also often introduced. Here, for a scalar material with constant
σ, ε, and μ, the real and imaginary parts of the wave vector have the same direction.
The new feature in comparison with non-conductors is longitudinal damping.
Therefore, we set

$$\mathbf k' = (\alpha + i\beta)\,\mathbf k ,$$

where as before ck = ω with c = 1/√(εμ). Then we have

$$\exp\{i(\mathbf k'\cdot\mathbf r - \omega t)\} = \exp(-\beta\,\mathbf k\cdot\mathbf r)\,\exp\{i(\alpha\,\mathbf k\cdot\mathbf r - \omega t)\}$$

and (α + iβ)² = 1 + iσ/εω, whence

$$\alpha = \sqrt{\tfrac12\sqrt{1 + (\sigma/\varepsilon\omega)^2} + \tfrac12} \quad\text{and}\quad \beta = \sqrt{\tfrac12\sqrt{1 + (\sigma/\varepsilon\omega)^2} - \tfrac12} .$$


Fig. 3.22 Repulsion of the current. Decay of the alternating fields in the interior of a conductor
(dashed lines show their amplitude), here for σ ≫ εω and hence α ≈ β. (When σ/εω < ∞, there
is also a normal component of the magnetic field and a tangential component of the electric field)

Now, with increasing k · r, the amplitude decreases. The wave is damped spatially.
Since conductors usually have σ/εω ≫ 1, whereupon the electric current is large
compared to the displacement current, we obtain the decay length

$$d = \frac{1}{\beta k} \approx \frac{1}{k}\sqrt{\frac{2\varepsilon\omega}{\sigma}} = \sqrt{\frac{2}{\sigma\mu\omega}} ,$$

over which the amplitude for perpendicular incidence drops to the fraction 1/e of its value at the
surface. High-frequency alternating currents are thus repelled from the interior of the
conductor, flowing only at the surface. This is referred to as repulsion of the current
or the skin effect (see Fig. 3.22). The higher the conductivity, the shorter the decay
length. For the phase velocity, we have c′ = ω/αk = c/α, and for σ/εω ≫ 1, we
thus have α ≈ β ≫ 1, whence also c′ ≈ c/β = ωd and therefore c′ ≪ c.
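The damping parameters and the decay length can be evaluated numerically. The sketch below is an illustration under stated assumptions: the conductivity 5.96 × 10⁷ A/(V m) is an approximate textbook value for copper that is not quoted in this section, vacuum values are used for ε and μ, and the function names are hypothetical.

```python
import math

# Assumed SI values (not taken from the text above).
EPS_0 = 8.8541878128e-12       # permittivity of vacuum in A s / (V m)
MU_0 = 4.0e-7 * math.pi        # permeability of vacuum in V s / (A m)
SIGMA_COPPER = 5.96e7          # approximate conductivity of copper in A / (V m)


def damping_parameters(sigma, eps, omega):
    """Return (alpha, beta) with (alpha + i beta)**2 = 1 + i sigma/(eps omega)."""
    x = sigma / (eps * omega)
    root = math.sqrt(1.0 + x * x)
    return math.sqrt(0.5 * root + 0.5), math.sqrt(0.5 * root - 0.5)


def decay_length(sigma, eps, mu, omega):
    """Decay length d = 1/(beta k) with k = omega * sqrt(eps * mu)."""
    _, beta = damping_parameters(sigma, eps, omega)
    k = omega * math.sqrt(eps * mu)
    return 1.0 / (beta * k)


omega = 2.0 * math.pi * 50.0   # 50 Hz alternating current
d_exact = decay_length(SIGMA_COPPER, EPS_0, MU_0, omega)
d_approx = math.sqrt(2.0 / (SIGMA_COPPER * MU_0 * omega))   # valid for sigma >> eps*omega
print(d_exact, d_approx)       # both in metres
```

For these values σ/εω is of order 10¹⁶, so the exact and the approximate decay length agree to machine precision; they come out at a little over 9 mm.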

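Since

$$\mathbf k'\cdot\mathbf E(\mathbf k') = 0 , \quad \omega\,\mathbf B(\mathbf k') = \mathbf k'\times\mathbf E(\mathbf k') , \quad\text{and}\quad \mathbf k'\cdot\mathbf B(\mathbf k') = 0 ,$$

the three (complex) vectors k′ = (α + iβ)k, E(k′) and B(k′) are once again perpendicular
to each other and still form a right-handed frame, but E and B differ in
phase and therefore no longer have the same nodes. If, as in Sect. 3.3.8, we average
over the time, we obtain

$$\overline{\mathbf H(t,\mathbf r)\cdot\mathbf B(t,\mathbf r)} = \tfrac12\,\mathbf H^*(\mathbf k')\cdot\mathbf B(\mathbf k')\,\exp(-2\beta\,\mathbf k\cdot\mathbf r) = \sqrt{1 + (\sigma/\varepsilon\omega)^2}\ \ \overline{\mathbf E(t,\mathbf r)\cdot\mathbf D(t,\mathbf r)} ,$$

where the square-root factor originates from k′* · k′/k². For most conductors, there
is much more energy in the magnetic field than in the electric field. Here now the
energy density decreases with increasing distance from the surface, in proportion to
exp(−2β k · r) (Problem 3.41).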

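If a conductor is adjacent to an insulator, and if n points from the conductor to
the insulator, then we have the boundary conditions

$$\begin{aligned}
\mathbf n\times(\mathbf E_I - \mathbf E_C) &= 0 , & \mathbf n\cdot(\mathbf B_I - \mathbf B_C) &= 0 ,\\
\mathbf n\cdot(\mathbf D_I - \mathbf D_C) &= \rho_A , & \mathbf n\times(\mathbf H_I - \mathbf H_C) &= \mathbf j_A .
\end{aligned}$$
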


The fields do not enter an ideal conductor at all; it is fully screened by charges
and currents on the surface (E_C = ... = 0). Therefore, the electric field lines end up
perpendicular to the surface of an ideal conductor (without tangential component,
i.e., E_t = 0), and the magnetic fields adapt to the surface (without normal component,
i.e., H_n = 0). But if the conductivity is finite (normal conductor), a current is
accompanied by a finite field in the current direction (E_t ≠ 0), and there is no surface
current density. Therefore the tangential component of H_C turns continuously into
that of H_I and decays exponentially in the conductor (for ω ≠ 0) with increasing
distance from the surface.


3.3.11 Summary: Maxwell’s Equations 


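Two new quantities lead from statics to time-dependent phenomena: charge conservation
(continuity equation) supplies Maxwell’s displacement current ∂D/∂t, and
Faraday’s induction law connects ∂B/∂t with ∇ × E, where the sign results in Lenz’s
rule. The induction field counteracts the change in the magnetic field. Hence we have
the basic Maxwell equations:

$$\begin{aligned}
\nabla\times\mathbf E &= -\frac{\partial\mathbf B}{\partial t} , & \nabla\cdot\mathbf B &= 0 ,\\
\nabla\cdot\mathbf D &= \rho , & \nabla\times\mathbf H &= \mathbf j + \frac{\partial\mathbf D}{\partial t} .
\end{aligned}$$

These differential equations correspond to integral equations,

$$\begin{aligned}
\oint_{(A)} d\mathbf r\cdot\mathbf E &= -\frac{d}{dt}\int_A d\mathbf f\cdot\mathbf B , & \oint_{(V)} d\mathbf f\cdot\mathbf B &= 0 ,\\
\oint_{(V)} d\mathbf f\cdot\mathbf D &= Q , & \oint_{(A)} d\mathbf r\cdot\mathbf H &= I + \frac{d}{dt}\int_A d\mathbf f\cdot\mathbf D ,
\end{aligned}$$

and boundary conditions,

$$\begin{aligned}
\mathbf n\times(\mathbf E_+ - \mathbf E_-) &= 0 , & \mathbf n\cdot(\mathbf B_+ - \mathbf B_-) &= 0 ,\\
\mathbf n\cdot(\mathbf D_+ - \mathbf D_-) &= \rho_A , & \mathbf n\times(\mathbf H_+ - \mathbf H_-) &= \mathbf j_A .
\end{aligned}$$

Taking Fourier transforms with exp{i(k · r − ωt)}, the four Maxwell equations read

$$\begin{aligned}
\mathbf k\times\mathbf E(\omega,\mathbf k) &= \omega\,\mathbf B(\omega,\mathbf k) , & \mathbf k\cdot\mathbf B(\omega,\mathbf k) &= 0 ,\\
\mathbf k\cdot\mathbf D(\omega,\mathbf k) &= -i\,\rho(\omega,\mathbf k) , & \mathbf k\times\mathbf H(\omega,\mathbf k) &= -i\,\mathbf j(\omega,\mathbf k) - \omega\,\mathbf D(\omega,\mathbf k) .
\end{aligned}$$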


In charge-free, homogeneous space, they lead to transverse waves, and they obey the 
telegraph equation, which is the same for E and B. Here the three vectors k, E, and 
B are pairwise perpendicular to each other. 


The time-dependent potentials Φ(t, r) and A(t, r) are useful:

$$\mathbf E = -\nabla\Phi - \frac{\partial\mathbf A}{\partial t} \quad\text{and}\quad \mathbf B = \nabla\times\mathbf A .$$

Then the first two Maxwell equations are automatically satisfied. However, the scalar
potential Φ is determined only up to an additive term ∂Ψ/∂t, and the vector potential
A only up to its sources; it would have to be changed by −∇Ψ. The potentials may
still be gauged to our advantage. Here Ψ or ∇ · A is fixed. For the Coulomb gauge, we
choose ∇ · A = 0, and for the Lorentz gauge, ∇ · A = −εμ ∂Φ/∂t. In both cases,
the resulting system of equations is decoupled.


3.4 Lorentz Invariance 


3.4.1 Velocity of Light in Vacuum 


In contrast to the situation in mechanics, in electromagnetism a specific velocity is 
picked out, even if there is no matter in space which could supply a reference frame. 
This velocity is the velocity of light in vacuum, viz., 


$$c_0 = 299\,792\,458\ \frac{\mathrm m}{\mathrm s} .$$


But in electromagnetism, no inertial system is special, because the four Maxwell 
equations are valid in all uniformly moving reference frames. In particular, the veloc- 
ity of light in vacuum is the same in all inertial frames. 

Due to this astonishing fact, we have to completely rethink the notion of velocity,
and thus also the measurement of lengths and times. In particular, we need a signal
velocity c₀ in order to fix equal times everywhere in space (coordinate system). In
order to synchronize clocks at two points with constant separation |r − r′|, we send
a signal from one point and expect it to arrive at the other point at the time Δt =
|r − r′|/c₀. Without a signal velocity, we cannot synchronize clocks at different
positions, and without clocks we cannot measure a velocity. The fastest velocity is
that of light, a million times faster than sound in air. Therefore, we synchronize our
clocks with light signals. (If there were some kind of action at a distance, with infinite
propagation velocity, then of course we would use that to synchronize our clocks.)

Since c₀ is the same in all inertial frames, we may not start from a generally fixed
(absolute) time, as we would in classical mechanics. There it is assumed that, for
two inertial frames moving relative to one another, only the position coordinates
transform, but not the time. That implies the validity of the Galilean transformation

$$t' = t , \qquad \mathbf r' = \mathbf r - \mathbf v\,t .$$

But this can be valid only for v ≪ c₀, because it does not contain the velocity of
light in empty space.


3.4.2 Lorentz Transformation 


We consider an inertial frame with unprimed coordinates (t, r) and one with primed 
coordinates (t’, r’), moving uniformly with velocity v relative to the first, where the 
position vectors are given in Cartesian coordinates. We restrict ourselves to homoge- 
neous Lorentz transformations: the origins (0, 0) of the two systems agree with each 
other. (Inhomogeneous Lorentz transformations contain four further parameters, 
since for them the zero point is also moved, and they form the Poincaré group.) Since 
otherwise no event is preferred, the two coordinate systems depend linearly on each 
other (via a real transformation matrix). The transition is reversible, and therefore 
their determinant must be either positive (a proper Lorentz transformation, continu- 
ously connected to the identity) or negative (improper Lorentz transformation, e.g., 
space reflection, also called the parity operation, t' = t,r' = —r, or time reversal, 
t' = —t,r’ = r). If we include these two improper Lorentz transformations with the 
proper ones, then we obtain the extended Lorentz group. If the past remains behind 
and the future ahead, then the Lorentz transformation is orthochronous (dt' /dt > 0). 

For infinitesimal Lorentz transformations, the matrix is barely different from the
unit matrix, so no squared terms in this difference need be accounted for in
(c₀t′)² − r′² = (c₀t)² − r². The additional terms form a skew-symmetric matrix with six
(real) independent elements and lead for finite Lorentz transformations to six free
parameters: three Euler angles and three parameters for the boost.

For the time being, we choose the axes such that v has only an x-component
(> 0). Then y = y′ and z = z′, and only (t, x) and (t′, x′) depend on each other in a
more involved way. At least in the two coordinate systems, the relative velocity will
be denoted by v = −v′. Therefore, we require

$$x' = \gamma\,(x - vt) \quad\text{and}\quad x = \gamma\,(x' + vt') ,$$

because the point x′ = 0 moves away with velocity v = x/t and the point x = 0
with the opposite velocity −v = x′/t′. The factor γ must be the same in the two
equations, otherwise the two reference frames would differ fundamentally from one
another. We determine γ from the requirement that, in the two systems, the same
velocity of light c₀ must result. Then we have

$$c_0\,\Delta t = \Delta x = \gamma\,(\Delta x' + v\,\Delta t') , \quad\text{as well as}\quad c_0\,\Delta t' = \Delta x' = \gamma\,(\Delta x - v\,\Delta t) .$$

We must therefore have c₀Δt = γ (c₀ + v) Δt′ and c₀Δt′ = γ (c₀ − v) Δt, and
hence,

$$\frac{\Delta t}{\Delta t'} = \frac{\gamma\,(c_0 + v)}{c_0} = \frac{c_0}{\gamma\,(c_0 - v)} \quad\Longrightarrow\quad \gamma^2 = \frac{1}{1 - (v/c_0)^2} .$$

We therefore use the abbreviation

$$\beta \equiv \frac{v}{c_0} \quad\Longrightarrow\quad \gamma = \frac{1}{\sqrt{1 - \beta^2}} .$$

Since the coordinates remain real, β < 1 must hold, so v < c₀ and γ ≥ 1 (see
Fig. 3.23).

Fig. 3.23 Relations between the parameters β, γ, and γ⁻¹. The dashed line is the relation for the
Galilean transformation

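From t′ = (x/γ − x′)/v and x′/v = γ (x/v − t), it follows that t′ = (γ⁻¹ −
γ) x/v + γt. Here, 1 − γ⁻² = β², so t′ = γ (t − βx/c₀). If we combine x′ =
γ (x − vt), y′ = y and z′ = z as a vector equation, we obtain finally the Lorentz
transformation

$$t' = \gamma\,\Big(t - \frac{\boldsymbol\beta\cdot\mathbf r}{c_0}\Big) \quad\text{and}\quad \mathbf r' = \mathbf r + \frac{\gamma - 1}{\beta^2}\,\boldsymbol\beta\ \boldsymbol\beta\cdot\mathbf r - \gamma\,\boldsymbol\beta\,c_0 t .$$

Conversely, because β′ = −β, we have

$$t = \gamma\,\Big(t' + \frac{\boldsymbol\beta\cdot\mathbf r'}{c_0}\Big) \quad\text{and}\quad \mathbf r = \mathbf r' + \frac{\gamma - 1}{\beta^2}\,\boldsymbol\beta\ \boldsymbol\beta\cdot\mathbf r' + \gamma\,\boldsymbol\beta\,c_0 t' .$$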
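The vector form of the Lorentz transformation can be tried out numerically. The sketch below is a minimal illustration: positions and β are plain 3-tuples, the function names `lorentz_transform` and `interval` are hypothetical, and the sample event is arbitrary. It checks that the transformation with −β undoes the one with β and that (c₀t)² − r² keeps its value.

```python
import math

C0 = 299_792_458.0   # velocity of light in vacuum in m/s


def lorentz_transform(t, r, beta):
    """Transform the event (t, r) to a frame moving with velocity beta * C0.

    t    : time in s
    r    : position as a 3-tuple in m
    beta : 3-tuple v / c0 with |beta| < 1
    """
    b2 = sum(b * b for b in beta)
    if b2 >= 1.0:
        raise ValueError("|beta| must be smaller than 1")
    gamma = 1.0 / math.sqrt(1.0 - b2)
    b_dot_r = sum(b * x for b, x in zip(beta, r))
    t_new = gamma * (t - b_dot_r / C0)
    # (gamma - 1)/beta^2 * (beta . r); this term vanishes for beta = 0
    factor = (gamma - 1.0) / b2 * b_dot_r if b2 > 0.0 else 0.0
    r_new = tuple(x + factor * b - gamma * b * C0 * t for x, b in zip(r, beta))
    return t_new, r_new


def interval(t, r):
    """Lorentz invariant (c0 t)^2 - r^2."""
    return (C0 * t) ** 2 - sum(x * x for x in r)


t, r = 1.0e-6, (100.0, 200.0, -50.0)
beta = (0.3, 0.4, 0.5)
t1, r1 = lorentz_transform(t, r, beta)
t2, r2 = lorentz_transform(t1, r1, tuple(-b for b in beta))   # inverse transformation
print(t1, r1)
print(t2, r2)
print(interval(t, r), interval(t1, r1))
```

Up to rounding, the second transformation returns the original event, and the two printed invariants coincide.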


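In the limit of small velocities v ≪ c₀, whence β ≪ 1 and γ ≈ 1, we arrive at the
above-mentioned Galilean transformation

$$t' = t , \quad \mathbf r' = \mathbf r - \mathbf v\,t , \qquad\text{or}\qquad t = t' , \quad \mathbf r = \mathbf r' + \mathbf v\,t' .$$

But this holds only approximately because of the finite signal velocity c₀. Therefore,
from now on, we shall only deal with the Lorentz transformation. In particular, for
v ∥ e_x, we have

$$\begin{pmatrix} c_0 t' \\ x' \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma \\ -\beta\gamma & \gamma \end{pmatrix}\begin{pmatrix} c_0 t \\ x \end{pmatrix} , \qquad \begin{pmatrix} c_0 t \\ x \end{pmatrix} = \begin{pmatrix} \gamma & \beta\gamma \\ \beta\gamma & \gamma \end{pmatrix}\begin{pmatrix} c_0 t' \\ x' \end{pmatrix} ,$$

along with y′ = y and z′ = z. With the consequences

$$\begin{aligned}
\Delta t' &= \gamma\,\Big(\Delta t - \beta\,\frac{\Delta x}{c_0}\Big) , & \Delta t &= \gamma\,\Big(\Delta t' + \beta\,\frac{\Delta x'}{c_0}\Big) ,\\
\Delta x' &= \gamma\,(\Delta x - v\,\Delta t) , & \Delta x &= \gamma\,(\Delta x' + v\,\Delta t') ,
\end{aligned}$$
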

we can compare rulers and clocks in reference frames moving relative to each other 
and derive two noteworthy phenomena. 

The first is Lorentz contraction: the ends of a ruler of length Δx in its rest system
must be measured simultaneously in the moving system and are found to be closer
together:

$$\Delta t' = 0: \qquad \Delta x' = \frac{\Delta x}{\gamma} < \Delta x .$$

Conversely, thanks to the requirement Δt = 0, the length Δx′ in the oppositely moving
(unprimed) system is also shorter, i.e., Δx = Δx′/γ. Moving lengths are shorter
than the proper length in the rest system by the factor 1/γ = √(1 − β²) < 1. In addition
to Lorentz contraction, owing to the finite light propagation time, a dilation by
the factor 1/(1 − β) also occurs when frames approach one another and a compression
by 1/(1 + β) when they move apart. The total factor √(1 ± β)/√(1 ∓ β) is also
shown in the middle of Fig. 3.26 (and see also Table 3.1, although reversed there,
since frequencies are inversely proportional to wavelengths).

The second striking phenomenon is relativistic time dilation: times must be compared
at the position of the clock in the rest system, and result in times in the moving
system being dilated by the factor γ > 1 compared with the proper time (in the rest
system):

$$\Delta x = 0: \quad \Delta t' = \gamma\,\Delta t > \Delta t , \qquad\text{or}\qquad \Delta x' = 0: \quad \Delta t = \gamma\,\Delta t' .$$

This effect must be included when determining the lifetimes of fast-moving particles:
for v ≈ c₀, the factor γ is significantly greater than 1.
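Both phenomena are governed by the single factor γ. The short sketch below evaluates it; the muon mean lifetime of about 2.197 × 10⁻⁶ s and the value β = 0.995 are assumptions chosen for illustration and are not taken from the text, and the function names are hypothetical.

```python
import math


def gamma_factor(beta):
    """gamma = 1 / sqrt(1 - beta^2) for |beta| < 1."""
    if abs(beta) >= 1.0:
        raise ValueError("|beta| must be smaller than 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def contracted_length(proper_length, beta):
    """Length of a moving ruler, measured simultaneously in the observer's frame."""
    return proper_length / gamma_factor(beta)


def dilated_time(proper_time, beta):
    """Time span in the observer's frame belonging to a proper time span."""
    return gamma_factor(beta) * proper_time


MUON_LIFETIME = 2.197e-6   # s, approximate mean proper lifetime (assumed value)
beta = 0.995               # hypothetical velocity v / c0 of the muon

print(gamma_factor(beta))
print(dilated_time(MUON_LIFETIME, beta))   # mean lifetime seen in the laboratory
print(contracted_length(1.0, 0.6))         # a 1 m ruler moving with beta = 0.6
```

For β = 0.995 the factor γ is roughly 10, so the mean lifetime observed in the laboratory is about ten times the proper lifetime.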

The two phenomena can also be read off from the Minkowski diagram (Fig. 3.24).
But it is worth making a few comments. The quantity (c₀t)² − x² is a Lorentz
invariant: (c₀t′)² = γ² (c₀t − βx)² and x′² = γ² (x − βc₀t)² imply (c₀t′)² − x′² =
γ² {(c₀t)² − x²} (1 − β²) with 1 − β² = γ⁻². Therefore, for a Lorentz transformation
the world points (c₀t, x) in the Minkowski diagram move on a hyperbola, and
for c₀t = x, on the associated asymptote. We distinguish here the time-like region
with |c₀t| > |x| and the space-like region with |c₀t| < |x| (actually, we should write
|r| instead of |x|). The surface |c₀t| = |r| is called the light cone. Time-like world
points on the hyperbola (c₀t)² − x² = C² > 0 then obey the parameter representation
c₀t = C cosh φ, x = C sinh φ. (For a space-like world point, c₀t is exchanged
with x.)

Fig. 3.24 Minkowski diagram. It has a spatial coordinate and the (reduced) time c₀t as axes and
the light cone |c₀t| = |r| as diagonal. A moving coordinate system is also shown. Its axes have
slopes of β or β⁻¹. The scale transformation is indicated by the hyperbolic curves; they connect
world points at equal positions (blue curves) or times (red curves). The two arrows at bottom right
indicate the length contraction, those top left the time dilation: they point from the unit coordinate
value in the rest systems to the axes of the moving systems, each parallel to the axes

With α = arctanh β and φ′ = φ − α, the above-mentioned Lorentz transformation,
i.e., the transition to oblique space-time coordinates, is then simply

$$\begin{aligned}
c_0 t &= C\cosh\varphi & &\Longrightarrow & c_0 t' &= C\cosh\varphi' ,\\
x &= C\sinh\varphi & &\Longrightarrow & x' &= C\sinh\varphi' ,
\end{aligned}$$

if we employ the addition theorems for hyperbolic functions, namely, the relation
cosh (φ − α) = cosh φ cosh α − sinh φ sinh α (with the special case 1 = cosh² α −
sinh² α) and sinh (φ − α) = sinh φ cosh α − cosh φ sinh α.
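The equivalence of the two descriptions can be checked numerically for a time-like world point. The sketch below is an illustration under stated assumptions: it treats only points with c₀t > |x|, works with the reduced time ct as a length, and uses hypothetical function names.

```python
import math


def boost_gamma_form(ct, x, beta):
    """Lorentz transformation of (c0 t, x) written with beta and gamma."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (ct - beta * x), gamma * (x - beta * ct)


def boost_hyperbolic_form(ct, x, beta):
    """Same transformation for a time-like point (ct > |x|) via phi' = phi - alpha."""
    C = math.sqrt(ct * ct - x * x)    # invariant: (c0 t)^2 - x^2 = C^2
    phi = math.atanh(x / ct)          # c0 t = C cosh(phi), x = C sinh(phi)
    alpha = math.atanh(beta)
    return C * math.cosh(phi - alpha), C * math.sinh(phi - alpha)


print(boost_gamma_form(5.0, 3.0, 0.6))
print(boost_hyperbolic_form(5.0, 3.0, 0.6))
```

Here the world point (5, 3) moves along its hyperbola with C = 4 to the point (4, 0), because φ = α for β = 0.6.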


3.4.3 Four-Vectors 


The Lorentz transformation connects space and time and mixes their coordinates.
Therefore, instead of the usual three-vectors in the normal space, we now take four-vectors
in space and time. In order to have all four components as lengths, we use
the path length c₀t of the light instead of the time, and take it as zeroth component:

$$(x^\mu)\ \widehat{=}\ (x^0, x^k)\ \widehat{=}\ (x^0, x^1, x^2, x^3) = (c_0 t, x, y, z) = (c_0 t, \mathbf r) .$$

We let Greek indices (e.g., μ) run from 0 to 3, Latin indices (e.g., k) from 1 to 3.



As in Sect. 1.2.2, we also distinguish in four dimensions between covariant and 
contravariant vector components with different transformation behavior. Then for a 
Lorentz transformation, we have

$$dx'^\mu = \sum_{\nu=0}^{3} \frac{\partial x'^\mu}{\partial x^\nu}\,dx^\nu .$$


In the following we would like always to sum over doubly appearing indices from 
0 to 3 if in an expression each occurs once as a subscript and once as a superscript 
(Einstein summation convention). In this way we avoid the bothersome notation of 
the summation sign—we often have to contract tensors and we have already used 
this idea to abbreviate the scalar product in vector algebra. With this and according 
to p. 33, we find 


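$$\begin{aligned}
A'^\mu &= \frac{\partial x'^\mu}{\partial x^\nu}\,A^\nu && \text{for contravariant vector components,}\\
A'_\mu &= \frac{\partial x^\nu}{\partial x'^\mu}\,A_\nu && \text{for covariant vector components.}
\end{aligned}$$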


We may also read these two equations as matrix equations—they are linear trans- 
formations with symmetric transformation matrices which are inverse to each other. 
For the example considered in the last section, they read 


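$$\Big(\frac{\partial x'^\mu}{\partial x^\nu}\Big) = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0\\ -\beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad \Big(\frac{\partial x^\nu}{\partial x'^\mu}\Big) = \begin{pmatrix} \gamma & \beta\gamma & 0 & 0\\ \beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix} .$$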

If the coordinate axes are not as well adjusted to the relative velocity as here, then of 
course not so many matrix elements a”, will vanish. Generally, with x’"x',, = x’x,, 
we always have 
aia =g, and a”, =a". 
The special case here suffices for the general principle. 
The transition between covariant and contravariant components is, however, not
as simple as for Cartesian coordinates. In particular, the velocity of light must have
the same value in all coordinate systems, so the Lorentz invariant (c₀dt)² − dr · dr
must be a scalar:

$$dx_\mu\,dx^\mu = c_0^{\,2}\,dt^2 - d\mathbf r\cdot d\mathbf r .$$

This is achieved by introducing the Minkowski metric

$$(g_{\mu\nu}) = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1 \end{pmatrix} .$$



This matrix is sometimes written also as diag (1, −1, −1, −1). On p. 32, we have
g_ii = g_i · g_i > 0 and also g^ii > 0. This suggests introducing imaginary base vectors,
but we shall not consider these further here, since we only need the metric to interchange
upper and lower indices. To this end, we always take the above-mentioned
fundamental tensor. Thus, since x_μ = g_μν x^ν, we have

$$(x_\mu)\ \widehat{=}\ (x_0, x_k)\ \widehat{=}\ (x_0, x_1, x_2, x_3) = (c_0 t, -x, -y, -z) .$$


This suggests choosing the fundamental tensor with opposite sign, since then the
space components remain unchanged for the transition from three to four dimensions.
(It is also common to set x⁴ = ic₀t and drop x⁰.) But then physically sensible scalar
products like p_μ p^μ become negative. For this reason, we prefer, like many other
authors, to stick to the choice just made.
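The role of the metric in lowering indices and forming invariant scalar products can be illustrated in a few lines. This is a minimal sketch, assuming the sign convention diag (1, −1, −1, −1) chosen above and a boost along the x-axis; the function names and the sample four-vector are hypothetical.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)   # diagonal of the metric, diag(1, -1, -1, -1)


def lower(x_up):
    """Covariant components x_mu = g_{mu nu} x^nu from contravariant ones."""
    return tuple(g * c for g, c in zip(METRIC, x_up))


def scalar_product(a_up, b_up):
    """Lorentz-invariant product a_mu b^mu of two contravariant four-vectors."""
    return sum(a * b for a, b in zip(lower(a_up), b_up))


def boost_x(x_up, beta):
    """Lorentz transformation of (c0 t, x, y, z) for relative velocity beta*c0 along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    ct, x, y, z = x_up
    return (gamma * (ct - beta * x), gamma * (x - beta * ct), y, z)


x_up = (5.0, 3.0, 1.0, 2.0)
print(lower(x_up))
print(scalar_product(x_up, x_up), scalar_product(boost_x(x_up, 0.6), boost_x(x_up, 0.6)))
```

Lowering the index only flips the sign of the three space components, and the product x_μ x^μ keeps its value under the boost.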

For the transition from three to four dimensions, however, we encounter some
difficulties with the concept of the vector product. In particular, a × b should be
perpendicular to a and b, but this is unique to three dimensions. In four dimensions,
we may no longer refer to “axial vectors” as vectors. But if we take over the usual
components of a vector product, then it transforms according to (with Latin indices
running from 1 to 3)

$$a'^i\,b'^k - a'^k\,b'^i = \frac{\partial x'^i}{\partial x^j}\,\frac{\partial x'^k}{\partial x^l}\,\big(a^j\,b^l - a^l\,b^j\big) ,$$


thus as a tensor of second rank (see p. 35 and Problem 2.4). It is skew-symmetric:

$$T^{ik} \equiv a^i\,b^k - a^k\,b^i = -\,T^{ki} .$$

In three dimensions, such a tensor has three independent components T¹² = −T²¹,
T²³ = −T³² and T³¹ = −T¹³, while T¹¹ = T²² = T³³ = 0. In the following, we
shall also consider four-dimensional skew-symmetric tensors of second rank, with
six independent components. They have the properties

$$T^{0k} = -T^{k0} = -T^{0}{}_{k} = +T_{k}{}^{0} = -T^{k}{}_{0} = +T_{0}{}^{k} = -T_{0k} = +T_{k0} ,$$

as follows immediately from the metric.
In addition, the circulation density of a vector field is a skew-symmetric tensor of 
second rank: 


$$\frac{\partial a_k}{\partial x^i} - \frac{\partial a_i}{\partial x^k}\,.$$




Note that, because of the derivatives, all indices are taken to be covariant. For
variable base vectors, the derivative of a vector component $a_i$ with respect to $x^k$
arises, and then also $\mathbf a\cdot\partial\mathbf g_i/\partial x^k$. We thus have to introduce the Christoffel symbols
of Sect. 1.2.6, although for rotations, these contributions cancel:
$\partial\mathbf g_i/\partial x^k = \partial^2\mathbf r/(\partial x^i\,\partial x^k) = \partial\mathbf g_k/\partial x^i$. However, for space-time considerations, we now restrict
ourselves to fixed base vectors anyway.


3.4.4 Examples of Four-Vectors 


As a first example, we have already met the four-vector 
$$(x^\mu) = (c_0 t, \mathbf r) \iff (x_\mu) = (c_0 t, -\mathbf r)\,.$$

If we want to build the velocity vector $(v^\mu)$, we cannot simply differentiate with
respect to time, since that would not be Lorentz invariant; we must differentiate with
respect to the proper time τ (see p. 230). We have dt = γ dτ, or d/dτ = γ d/dt, and
hence

$$(v^\mu) = \gamma\,(c_0, \mathbf v) \iff (v_\mu) = \gamma\,(c_0, -\mathbf v)\,,\quad\text{and}\quad v_\mu v^\mu = c_0^2\,.$$

Thus only in the non-relativistic limit $v \ll c_0$ do we arrive at the usual notion of
velocity, for then $\gamma \approx 1$. We can also derive this in a different way. Corresponding


to velocity 0, we have the four-vector $(v^\mu) = (c_0, 0, 0, 0)$. If it undergoes a Lorentz
transformation with the velocity −v in the x-direction, or

$$\begin{pmatrix} \gamma & \gamma\beta & 0 & 0\\ \gamma\beta & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix},$$

then since the matrix is symmetric, we may multiply it by a row vector from the left
or a column vector from the right to obtain the four-vector $(v'^\mu) = \gamma\,(c_0, v, 0, 0)$,
and thus the same result as before.
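The boost argument above can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`boost_x`, `apply`, `minkowski_square`) and the sample value β = 0.6 are illustrative assumptions, not notation from the text.

```python
import math

C0 = 299_792_458.0  # vacuum velocity of light in m/s


def gamma(beta):
    """Lorentz factor for beta = v/c0."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def boost_x(beta):
    """Symmetric boost matrix (frame velocity -v along x), so that a
    particle at rest acquires the velocity +beta*c0."""
    g = gamma(beta)
    return [[g, g * beta, 0.0, 0.0],
            [g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def apply(matrix, vec):
    """Matrix times column vector."""
    return [sum(matrix[i][k] * vec[k] for k in range(4)) for i in range(4)]


def minkowski_square(vec):
    """v_mu v^mu with the metric diag(1, -1, -1, -1)."""
    return vec[0] ** 2 - vec[1] ** 2 - vec[2] ** 2 - vec[3] ** 2


beta = 0.6                                     # assumed sample value
v_rest = [C0, 0.0, 0.0, 0.0]                   # four-velocity for v = 0
v_moving = apply(boost_x(beta), v_rest)        # gamma * (c0, v, 0, 0)
print([component / C0 for component in v_moving])
print(minkowski_square(v_moving) / C0 ** 2)    # invariant, in units of c0^2
```

The invariant $v_\mu v^\mu$ stays equal to $c_0^2$ for any β below 1, which is the property used in the text.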

This second idea allows us to derive the addition law for velocities. If the
above-mentioned matrix acts on the four-vector with parallel velocity vectors,
$\gamma_0\,(c_0, v_0, 0, 0)$, it follows that

$$(v'^\mu) = \gamma\gamma_0\,(1+\beta\beta_0)\,\Big(c_0,\ \frac{v+v_0}{1+\beta\beta_0},\ 0,\ 0\Big)\,,$$


and if it acts on the perpendicular velocity vector $\gamma_0\,(c_0, 0, v_0, 0)$, we find

$$(v'^\mu) = \gamma\gamma_0\,\Big(c_0,\ v,\ \frac{v_0}{\gamma},\ 0\Big)\,.$$

Fig. 3.25 With the velocity parameter $w = \operatorname{arctanh}\beta$, known as the rapidity, the addition law for parallel velocities $\beta = (\beta_0+\beta_1)/(1+\beta_0\beta_1)$ then simply reads $w = w_0 + w_1$. For $|\beta| \ll 1$, we have $w \approx \beta$ (the plot shows $\beta = v/c_0$ as a function of $w$)


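Here, $v$ and $v_0$ are thus not equivalent: Lorentz transformations do not in general
commute. The factors $\gamma\gamma_0\,(1+\beta\beta_0)$ or $\gamma\gamma_0$ are indeed the same as the quantity
$\gamma' = (1-\beta'^2)^{-1/2}$, as we shall now prove by showing that $\beta'^2 = 1-\gamma'^{-2}$:

$$\Big(\frac{\beta+\beta_0}{1+\beta\beta_0}\Big)^2 = 1 - \frac{(1-\beta^2)(1-\beta_0^2)}{(1+\beta\beta_0)^2} = 1 - \frac{1}{\gamma^2\gamma_0^2\,(1+\beta\beta_0)^2}\,,$$

and

$$\beta^2 + \frac{\beta_0^2}{\gamma^2} = 1 - (1-\beta^2)(1-\beta_0^2)\,.$$

Incidentally, this is also equal to $\beta_0^2 + \beta^2\gamma_0^{-2}$. The addition law can be summarized
by

$$\mathbf v' = \frac{1}{1+\mathbf v\cdot\mathbf v_0/c_0^2}\,\Big\{\frac{\mathbf v_0}{\gamma} + \Big(\frac{\gamma-1}{\gamma}\,\frac{\mathbf v\cdot\mathbf v_0}{v^2} + 1\Big)\,\mathbf v\Big\}\,.$$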


This equation also follows from $d\mathbf r'/dt' = d\mathbf r'/dt\cdot dt/dt'$ with the formulae for the
Lorentz transformation on p. 229, if $d\mathbf r/dt = \mathbf v_0$ is used (see Fig. 3.25).

Only if all velocities are small compared to $c_0$ do we have $\mathbf v' \approx \mathbf v + \mathbf v_0$. Otherwise
the velocity of light in vacuum could also be exceeded, but in fact, $v' = c_0$ if $v$ or $v_0$
is equal to $c_0$. For parallel velocities, this follows immediately from

$$v' = \frac{v+v_0}{1+\beta\beta_0}\,,$$

and for perpendicular velocities,

$$v'^2 = v^2 + v_0^2/\gamma^2 = c_0^2\,(\beta^2 + \beta_0^2 - \beta_0^2\beta^2)\,.$$


When β = 1 or $\beta_0$ = 1, the bracket is equal to 1.

Table 3.1 Longitudinal and transverse Doppler effect

θ     0     ½π                π
θ'    0     ½π + arcsin β     π
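The addition law can be evaluated numerically for arbitrary directions. The sketch below works in units of $c_0$ and uses the vector form of the law, with the convention that `beta` is the boost velocity and `beta0` the velocity before the boost; the function names and sample values are illustrative assumptions.

```python
import math


def add_velocities(beta, beta0):
    """Relativistic addition of two velocities given as 3-tuples in units of c0.

    beta  : velocity v/c0 associated with the Lorentz transformation
    beta0 : velocity v0/c0 of the particle before the transformation
    """
    b2 = sum(x * x for x in beta)
    if b2 == 0.0:
        return tuple(beta0)          # no boost, nothing to add
    dot = sum(x * y for x, y in zip(beta, beta0))
    g = 1.0 / math.sqrt(1.0 - b2)
    factor = (g - 1.0) / g * dot / b2 + 1.0
    return tuple((y / g + factor * x) / (1.0 + dot)
                 for x, y in zip(beta, beta0))


def rapidity(beta):
    """Velocity parameter w = arctanh(beta)."""
    return math.atanh(beta)


parallel = add_velocities((0.5, 0.0, 0.0), (0.5, 0.0, 0.0))
perpendicular = add_velocities((0.6, 0.0, 0.0), (0.0, 0.5, 0.0))
with_light = add_velocities((0.6, 0.0, 0.0), (0.0, 1.0, 0.0))
print(parallel, perpendicular, with_light)
```

For parallel velocities the rapidities simply add, and combining any velocity with the velocity of light again gives a speed of $c_0$; swapping the two arguments in the perpendicular case gives a different direction, in line with the remark that the two velocities are not equivalent.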

If a medium with refractive index $n = c_0/c$ moves with velocity $v$ and there is
light travelling in it in the same direction, the velocity of this light will depend on
this reference system (Fizeau experiment on the drag of light in moving bodies):

$$c' = \frac{c+v}{1+\beta/n} = c + \frac{v}{1+\beta/n}\,\Big(1-\frac{1}{n^2}\Big) \approx c + \Big(1-\frac{1}{n^2}\Big)\,v\,.$$

The expression in brackets is called the (Fresnel) drag coefficient. However, for
dispersion, in addition to $-n^{-2}$, it also contains the term $(\omega/n)\,dn/d\omega$.
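As a quick plausibility check of the drag formula (without dispersion), the exact expression can be compared with the first-order Fresnel approximation. This is only a sketch in units of $c_0$; the function names and the sample values n = 1.333 and β = 10⁻⁶ are assumptions chosen for illustration.

```python
def light_speed_in_moving_medium(n, beta):
    """Exact result c'/c0 = (1/n + beta) / (1 + beta/n) for parallel motion."""
    return (1.0 / n + beta) / (1.0 + beta / n)


def fresnel_approximation(n, beta):
    """First-order result c'/c0 = 1/n + (1 - 1/n^2) * beta."""
    return 1.0 / n + (1.0 - 1.0 / n ** 2) * beta


n, beta = 1.333, 1.0e-6              # assumed sample values
exact = light_speed_in_moving_medium(n, beta)
approx = fresnel_approximation(n, beta)
print(exact, approx, exact - approx)  # the difference is of order beta^2
```

For n = 1 the medium has no effect and the exact expression returns 1 for every β, as it must for light in vacuum.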

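The zero of a wave is determined by the phase $\omega t - \mathbf k\cdot\mathbf r$ and must not depend
upon the choice of coordinates. The expression must be a Lorentz invariant and must
therefore be written in the form of a scalar product $k^\mu x_\mu$. Consequently, we have

$$(k^\mu) = \Big(\frac{\omega}{c_0},\ \mathbf k\Big)\,,\quad\text{with}\quad k^\mu k_\mu = 0\quad(\text{because } \omega = c_0 k)\,.$$

With $t = \gamma\,(t' + \mathbf v\cdot\mathbf r'/c_0^2)$ and $\mathbf r = \mathbf r' + \{(\gamma-1)\,v^{-2}\,\mathbf v\cdot\mathbf r' + \gamma t'\}\,\mathbf v$ (see p. 229),
and comparing coefficients in $\omega t - \mathbf k\cdot\mathbf r = \omega' t' - \mathbf k'\cdot\mathbf r'$, we deduce that

$$\omega' = \gamma\,(\omega - \mathbf v\cdot\mathbf k)\quad\text{and}\quad \mathbf k' = \mathbf k + \Big(\frac{\gamma-1}{v^2}\,\mathbf v\cdot\mathbf k - \frac{\gamma\,\omega}{c_0^2}\Big)\,\mathbf v\,.$$

With $\omega = c_0 k$, this implies the Doppler effect for the frequency, viz.,

$$\omega' = \omega\,\gamma\,(1-\beta\cos\theta)\,,$$

where θ is the angle between v and k. Thus the Doppler effect with $\nu\lambda = c_0$ yields
the wavelength $\lambda' = \lambda/\{\gamma\,(1-\beta\cos\theta)\}$. Some example applications are given in
Table 3.1 and Figs. 3.26 and 3.27.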
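The Doppler formula can be evaluated directly for the longitudinal (θ = 0, π) and transverse (θ = π/2) cases. The sketch below is an illustration only; the function names and the sample value β = 0.6 are assumptions.

```python
import math


def doppler_ratio(beta, theta):
    """Frequency ratio omega'/omega = gamma * (1 - beta * cos(theta))."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (1.0 - beta * math.cos(theta))


def wavelength_ratio(beta, theta):
    """Wavelength ratio lambda'/lambda, the inverse of the frequency ratio."""
    return 1.0 / doppler_ratio(beta, theta)


beta = 0.6                                    # assumed sample value
forward = doppler_ratio(beta, 0.0)            # v parallel to k
transverse = doppler_ratio(beta, math.pi / 2)
backward = doppler_ratio(beta, math.pi)       # v antiparallel to k
print(forward, transverse, backward)
```

The longitudinal values reduce to $\sqrt{(1\mp\beta)/(1\pm\beta)}$, while the transverse value is the pure factor γ, which has no non-relativistic counterpart.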

With the factor γ, a transverse and a quadratic Doppler effect occur (this does
not of course hold for the propagation of sound, as the velocity of sound is so much
smaller than the velocity of light). In addition, the propagation direction is described
differently (aberration).

Fig. 3.26 Angular dependence of the frequency, wavelength, and deviation. In the left and right
figures, straight lines refer to β = 0 (black), the curves to β = 1/4 (red), 1/2 (blue), and 3/4 (green).
The middle picture shows the ratio of the wavelengths λ'/λ in a polar diagram, namely the focal
representation of an ellipse with semi-axes γ and 1, and hence eccentricity β (here 1/2). See also
Fig. 3.31

Fig. 3.27 Doppler effect. A frequency depends on how fast the detector moves relative to the emitter.
Left: Decreasing distance. Right: Increasing distance. The linear Doppler effect is indicated by the
dashed line

With the unit vectors $\mathbf e \equiv \mathbf k/k$ and $\mathbf e' \equiv \mathbf k'/k'$,
and using $k' = \omega'/c_0 = \gamma\,(k - \boldsymbol\beta\cdot\mathbf k) = \gamma k\,(1 - \boldsymbol\beta\cdot\mathbf e)$, we deduce that

$$\mathbf e' = \frac{1}{\gamma\,(1-\boldsymbol\beta\cdot\mathbf e)}\,\Big\{\mathbf e + \Big(\frac{\gamma-1}{\beta^2}\,\boldsymbol\beta\cdot\mathbf e - \gamma\Big)\,\boldsymbol\beta\Big\}\,,$$


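and thus also $\boldsymbol\beta\cdot\mathbf e' = (\boldsymbol\beta\cdot\mathbf e - \beta^2)/(1 - \boldsymbol\beta\cdot\mathbf e)$. Here $\boldsymbol\beta\cdot\mathbf e' = \beta\cos\theta'$, so

$$\cos\theta' = \frac{\cos\theta-\beta}{1-\beta\cos\theta}\quad\Longrightarrow\quad \tan\theta' = \frac{\sin\theta}{\gamma\,(\cos\theta-\beta)}\,.$$

With increasing |β|, the difference between θ and θ' increases, although not for θ = 0
and π (see Table 3.1 and Fig. 3.26). The motion of the Earth about the Sun produces
an aberration of starlight < 20.5″.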
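The size of the stellar aberration can be estimated from the formula for cos θ'. The sketch below assumes a mean orbital speed of the Earth of 29.78 km/s, a value that is not given in the text and is used here only for illustration; the function name is likewise hypothetical.

```python
import math

C0 = 299_792_458.0                 # vacuum velocity of light in m/s
EARTH_ORBITAL_SPEED = 29.78e3      # assumed mean orbital speed in m/s


def aberration_angle(beta, theta):
    """Angle theta' from cos(theta') = (cos(theta) - beta) / (1 - beta cos(theta))."""
    cosine = (math.cos(theta) - beta) / (1.0 - beta * math.cos(theta))
    return math.acos(max(-1.0, min(1.0, cosine)))   # clamp rounding errors


beta_earth = EARTH_ORBITAL_SPEED / C0
# Largest deviation occurs for light arriving perpendicular to the motion.
deviation = aberration_angle(beta_earth, math.pi / 2) - math.pi / 2
deviation_arcsec = math.degrees(deviation) * 3600.0
print(round(deviation_arcsec, 2))
```

For θ = π/2 the deviation equals arcsin β, so for β of order 10⁻⁴ it is practically β itself, expressed in seconds of arc.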

For the concept of a density, it is important to note that the three-dimensional
volume element dV = dx dy dz is not invariant, because of the Lorentz contraction,
while the rest volume $dV_0 = \gamma\,dV$ is. The charge does not change. With this, we
then have $\rho\,dV = \rho_0\,dV_0$ and

$$\rho = \gamma\,\rho_0\,.$$

From the charge and current density, we build a four-vector

$$(j^\mu) = \rho_0\,(v^\mu) = \rho_0\gamma\,(c_0, \mathbf v) = (c_0\rho, \mathbf j)\,,\quad\text{with}\quad j_\mu j^\mu = (c_0\rho_0)^2\,.$$

In particular, $j^0 = c_0\rho$ and $\mathbf j = \rho\,\mathbf v$ as before, but ρ depends on the velocity through
γ, i.e., through the Lorentz contraction.


3.4.5 Conservation Laws 
In the following, we use the usual abbreviation

$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu}\quad\text{and}\quad \partial^\mu \equiv \frac{\partial}{\partial x_\mu}\,.$$

Clearly, the components $\partial_\mu\psi$ transform covariantly and $\partial^\mu\psi$ contravariantly. We
now prove the following theorem: If the four-dimensional source density $\partial_\mu j^\mu$
of a four-vector vanishes everywhere and if this vector differs from zero only in a
finite region of the three-dimensional space, then $\int dV\,j^0$ is constant for all times.
For the proof we extend Gauss’s theorem to four dimensions:

$$\int d^4x\ \partial_\mu j^\mu = \oint dS_\mu\ j^\mu\,,$$

where $d^4x = c_0\,dt\,dx\,dy\,dz = c_0\,dt\,dV$ and $dS_\mu$ denotes a three-dimensional surface
element for constant $x^\mu$. Its sign (direction) is fixed in such a way that it is positive if
its $x^\mu$ value is greater than in the considered volume (negative otherwise). Now we
choose the surfaces $S_1$, $S_2$, and $S_3$ for large $|x^1|$, $|x^2|$, and $|x^3|$ such that $j^\mu = 0$
holds there. Figure 3.28 supplies the rest of the proof.

An important application is the continuity equation:

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf j = 0 \iff \partial_\mu j^\mu = 0\,.$$

The theorem supplies

$$\int dV\ j^0 = c_0\int dV\ \rho = c_0\,Q$$

as a conserved quantity. The law of charge conservation follows from the continuity
equation, and conversely, the continuity equation follows from charge conservation,
something we already obtained in Sect. 3.2.1.

Fig. 3.28 $j^\mu$ is restricted in finite space (the cylinder). $\partial_\mu j^\mu = 0$ then yields $\int dS_0\,j^0 + \int dS_0'\,j^0 = 0$
(the circular face $S_0$ is covered here and therefore not indicated). Due to the directional sense of
the surface elements, the conservation law $\int dV\,j^0 = \int dV'\,j^0$ follows


3.4.6 Covariance of the Microscopic Maxwell Equations 


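On p. 210, we found the microscopic Maxwell equations, viz.,

$$\frac{1}{c_0^2}\,\frac{\partial\Phi}{\partial t} + \nabla\cdot\mathbf A = 0\,,$$

$$\Big(\frac{1}{c_0^2}\,\frac{\partial^2}{\partial t^2} - \Delta\Big)\,\Phi = \frac{\rho}{\varepsilon_0}\,,\qquad \Big(\frac{1}{c_0^2}\,\frac{\partial^2}{\partial t^2} - \Delta\Big)\,\mathbf A = \mu_0\,\mathbf j\,,$$

with the help of the potentials Φ and A in the Lorentz gauge (the Coulomb gauge
∇ · A = 0 is not Lorentz invariant). With the first equation, we combine the scalar
and vector potentials to yield the following four-potential:

$$(A^\mu) = \Big(\frac{\Phi}{c_0},\ \mathbf A\Big)\quad\Longrightarrow\quad \partial_\mu A^\mu = 0\,.$$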


Note that the equation $\partial_\mu A^\mu = 0$ does not result in a conservation law, since $A^\mu$
does not vanish sufficiently fast for large distances. In addition, using the other two
equations, we generalize the Laplace operator to the d’Alembert operator (quabla)

$$\Box \equiv \frac{1}{c_0^2}\,\frac{\partial^2}{\partial t^2} - \Delta = \frac{\partial^2}{\partial x^\mu\,\partial x_\mu} = \partial_\mu\partial^\mu\,.$$


This is a Lorentz invariant, often taken with the opposite sign, in particular, if the other
metric is used, with $g_{ik} = +\delta_{ik}$. If in addition to $\Phi = c_0 A^0$, we also take into account
$\rho = c_0^{-1}\,j^0$ and (by Weber’s equation) $c_0^{-2}\,\varepsilon_0^{-1} = \mu_0$, then the above-mentioned
inhomogeneous wave equations can be brought into the covariant form

$$\Box A^\mu = \mu_0\,j^\mu\,,\quad\text{with}\quad \partial_\mu A^\mu = 0\,.$$

In four-notation, the gauge transformation $\Phi' = \Phi + \partial\Psi/\partial t$, $\mathbf A' = \mathbf A - \nabla\Psi$
reads

$$A'^\mu = A^\mu + \partial^\mu\Psi\,,$$

because $\partial_0 = \partial^0$ and $\partial_k = -\partial^k$, in addition to $A^0 = \Phi/c_0$.


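With $\mathbf B = \nabla\times\mathbf A$ and $\mathbf E = -\nabla\Phi - \partial\mathbf A/\partial t$, noting that $A_k = -A^k$ and $A_0 = A^0$,
we clearly have

$$B_x = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} = -\partial^2 A^3 + \partial^3 A^2 = -\partial_2 A_3 + \partial_3 A_2\,,$$

$$E_x = -\frac{\partial\Phi}{\partial x} - \frac{\partial A_x}{\partial t} = c_0\,(\partial^1 A^0 - \partial^0 A^1) = c_0\,(\partial_0 A_1 - \partial_1 A_0)\,,$$

and correspondingly for the other two components of B and E. According to the
last two columns, $\mathbf E/c_0$ and B can be combined in the form of a four-dimensional
skew-symmetric tensor of second rank, the electromagnetic field tensor

$$F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu = -F^{\nu\mu}\,,$$

or equivalently, $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = -F_{\nu\mu}$:

$$(F^{\mu\nu}) = \begin{pmatrix} 0 & -E_x/c_0 & -E_y/c_0 & -E_z/c_0\\ +E_x/c_0 & 0 & -B_z & +B_y\\ +E_y/c_0 & +B_z & 0 & -B_x\\ +E_z/c_0 & -B_y & +B_x & 0 \end{pmatrix}$$

and

$$(F_{\mu\nu}) = \begin{pmatrix} 0 & +E_x/c_0 & +E_y/c_0 & +E_z/c_0\\ -E_x/c_0 & 0 & -B_z & +B_y\\ -E_y/c_0 & +B_z & 0 & -B_x\\ -E_z/c_0 & -B_y & +B_x & 0 \end{pmatrix}.$$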


Unfortunately, the field tensor is commonly denoted by F rather than by B, even
though it is B that is extended into four dimensions. For the extension of j to $j^\mu$ and A to
$A^\mu$, we are led by the space-like components, and likewise in the next section for the
extension of M to $M^{\mu\nu}$. However, the field tensor is usually also amended with the
factor $c_0$. Then it has the components of E and $c_0\mathbf B$ as elements.

Mixed derivatives commute with each other, if they are continuous. The Jacobi
identity $\partial^\lambda(\partial^\mu A^\nu - \partial^\nu A^\mu) + \partial^\mu(\partial^\nu A^\lambda - \partial^\lambda A^\nu) + \partial^\nu(\partial^\lambda A^\mu - \partial^\mu A^\lambda) = 0$ yields

$$\partial^\lambda F^{\mu\nu} + \partial^\mu F^{\nu\lambda} + \partial^\nu F^{\lambda\mu} = 0\,.$$


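So far we have used two Maxwell equations, namely ∇ · B = 0 and ∂B/∂t + ∇ × E = 0,
that is, precisely the two for which we have been able to introduce potentials.
The other two microscopic Maxwell equations $\nabla\cdot\mathbf E = \rho/\varepsilon_0$ and $\nabla\times\mathbf B = \mu_0\,(\mathbf j + \varepsilon_0\,\partial\mathbf E/\partial t)$
can be combined if $\Box A^\nu = \mu_0\,j^\nu$ and $\partial_\mu A^\mu = 0$ hold, to give

$$\partial_\mu F^{\mu\nu} = \partial_\mu(\partial^\mu A^\nu - \partial^\nu A^\mu) = \Box A^\nu - \partial^\nu\partial_\mu A^\mu = \mu_0\,j^\nu\,.$$

Hence, we have $\mu_0\,\partial_\nu j^\nu = \partial_\nu\partial_\mu F^{\mu\nu}$. The continuity equation $\partial_\nu j^\nu = 0$ now follows
immediately from the antisymmetry of the field tensor, because $F^{\mu\nu} = -F^{\nu\mu}$, but
$\partial_\mu\partial_\nu = +\partial_\nu\partial_\mu$, thus $\partial_\mu\partial_\nu F^{\mu\nu} = -\partial_\mu\partial_\nu F^{\nu\mu} = -\partial_\nu\partial_\mu F^{\nu\mu}$. After renaming the
summation indices, this expression is equal to its own negative and therefore vanishes.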

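According to p. 100, the interaction density is equal to $\rho\,\Phi - \mathbf j\cdot\mathbf A$, which is the
Lorentz invariant $j^\mu A_\mu$ in four-dimensional notation. Hence, we may also write
$\partial_\mu F^{\mu\nu} = \mu_0\,j^\nu$ as a generalized Euler-Lagrange equation, if we introduce the
Lagrange density

$$\mathcal L = -\frac{1}{4\mu_0}\,F_{\mu\nu}F^{\mu\nu} - j^\mu A_\mu$$

as a function of the $A_\nu$ and their derivatives $\partial_\mu A_\nu$. Using

$$\frac{\partial\mathcal L}{\partial(\partial_\mu A_\nu)} = \frac{\partial\mathcal L}{\partial F_{\kappa\lambda}}\,\frac{\partial F_{\kappa\lambda}}{\partial(\partial_\mu A_\nu)} = -\frac{F^{\kappa\lambda}}{2\mu_0}\,(\delta^\mu_\kappa\delta^\nu_\lambda - \delta^\nu_\kappa\delta^\mu_\lambda) = -\frac{F^{\mu\nu}}{\mu_0}$$

and $\partial_\mu F^{\mu\nu} = \mu_0\,j^\nu = -\mu_0\,\partial\mathcal L/\partial A_\nu$, we obtain the differential equation

$$\frac{\partial}{\partial x^\mu}\,\frac{\partial\mathcal L}{\partial(\partial_\mu A_\nu)} = \frac{\partial\mathcal L}{\partial A_\nu}$$

for the Lagrange density $\mathcal L$. This equation evidently generalizes the Euler-
Lagrange equation in Sect. 2.3.3, viz.,

$$\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot x^k}\Big) = \frac{\partial L}{\partial x^k}\,,$$

where the time is no longer preferred over the space coordinates. Note that
$\tfrac12 F^{\mu\nu}F_{\mu\nu} = \mathbf B\cdot\mathbf B - \mathbf E\cdot\mathbf E/c_0^2$.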
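The invariant $\tfrac12 F^{\mu\nu}F_{\mu\nu} = \mathbf B\cdot\mathbf B - \mathbf E\cdot\mathbf E/c_0^2$ can be verified by building the field tensor explicitly and lowering both indices with the metric diag(1, −1, −1, −1). The sketch below uses plain Python lists; all function names and the sample field values are illustrative assumptions.

```python
C0 = 299_792_458.0                # vacuum velocity of light in m/s
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu}


def field_tensor(E, B):
    """Contravariant field tensor F^{mu nu} from E (V/m) and B (T)."""
    ex, ey, ez = (component / C0 for component in E)
    bx, by, bz = B
    return [[0.0, -ex, -ey, -ez],
            [ex, 0.0, -bz, by],
            [ey, bz, 0.0, -bx],
            [ez, -by, bx, 0.0]]


def lower_both(F):
    """Covariant components F_{mu nu} = g_{mu mu} g_{nu nu} F^{mu nu}."""
    return [[METRIC[m] * METRIC[n] * F[m][n] for n in range(4)]
            for m in range(4)]


def half_contraction(F):
    """(1/2) F^{mu nu} F_{mu nu}."""
    F_low = lower_both(F)
    return 0.5 * sum(F[m][n] * F_low[m][n]
                     for m in range(4) for n in range(4))


E = (1.0 * C0, 2.0 * C0, 0.0)     # assumed sample field, so that E/c0 = (1, 2, 0)
B = (0.5, 0.0, 2.0)               # assumed sample field in T
F = field_tensor(E, B)
print(half_contraction(F))        # B.B - E.E/c0^2 = 4.25 - 5
```

Lowering both indices flips only the sign of the time-space components, which is why the electric part enters the invariant with a minus sign.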


3.4.7 Covariance of the Macroscopic Maxwell Equations 


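If we wish to use only macroscopically measurable notions, then instead of $\nabla\cdot\mathbf E = \rho/\varepsilon_0$
and $\nabla\times\mathbf B = \mu_0\,(\mathbf j + \varepsilon_0\,\partial\mathbf E/\partial t)$, or indeed $\partial_\mu F^{\mu\nu} = \mu_0\,j^\nu$, we now have to take
the Maxwell equations $\nabla\cdot\mathbf D = \bar\rho$ and $\nabla\times\mathbf H = \bar{\mathbf j} + \partial\mathbf D/\partial t$, i.e., in four-notation

$$\partial_\mu G^{\mu\nu} = \bar j^\nu\,,$$

with the skew-symmetric tensor

$$(G^{\mu\nu}) = \begin{pmatrix} 0 & -c_0 D_x & -c_0 D_y & -c_0 D_z\\ c_0 D_x & 0 & -H_z & H_y\\ c_0 D_y & H_z & 0 & -H_x\\ c_0 D_z & -H_y & H_x & 0 \end{pmatrix},$$

which is the four-dimensional extension of the vector H, just as $F^{\mu\nu}$ is that of B,
and the four-vector of the average current density

$$(\bar j^\mu) = (c_0\bar\rho,\ \bar{\mathbf j})\,.$$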


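In doing this, we also generalize $\mathbf D = \varepsilon_0\mathbf E + \mathbf P$, thus $\mathbf E/c_0 = \mu_0 c_0\,(\mathbf D - \mathbf P)$, and
$\mathbf B = \mu_0\,(\mathbf H + \mathbf M)$ to

$$F^{\mu\nu} = \mu_0\,(G^{\mu\nu} + M^{\mu\nu})\,,$$

with the (skew-symmetric) magnetization tensor

$$(M^{\mu\nu}) = \begin{pmatrix} 0 & c_0 P_x & c_0 P_y & c_0 P_z\\ -c_0 P_x & 0 & -M_z & M_y\\ -c_0 P_y & M_z & 0 & -M_x\\ -c_0 P_z & -M_y & M_x & 0 \end{pmatrix},$$

which extends the magnetization M to four dimensions. From this, we can easily
establish the magnetization current density $j_{\mathrm m}$. The decomposition

$$j^\nu = \bar j^\nu + j_{\mathrm m}^\nu\,,$$

with $j_{\mathrm m}^\nu = j^\nu - \bar j^\nu = \mu_0^{-1}\,\partial_\mu F^{\mu\nu} - \partial_\mu G^{\mu\nu}$ leads to

$$j_{\mathrm m}^\nu = \partial_\mu M^{\mu\nu}\,.$$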


Note that there is therefore a continuity equation for the magnetization current, viz.,
$\partial_\nu j_{\mathrm m}^\nu = 0$. Then,

$$(j_{\mathrm m}^\nu) = \Big(-c_0\,\nabla\cdot\mathbf P,\ \frac{\partial\mathbf P}{\partial t} + \nabla\times\mathbf M\Big)\,.$$

In electrostatics we have already encountered $\rho = \bar\rho - \nabla\cdot\mathbf P$ and in magnetostatics
$\mathbf j = \bar{\mathbf j} + \nabla\times\mathbf M$. But the displacement current also contributes and results in the
additional term $\partial\mathbf P/\partial t$.

The matrices $G_{\mu\nu}$ and $M_{\mu\nu}$ can be derived easily from $G^{\mu\nu}$ and $M^{\mu\nu}$ according
to p. 233: $G_{0k} = -G^{0k} = -G_{k0}$ and $G_{ik} = G^{ik} = -G_{ki}$, and likewise for $M_{\mu\nu}$.

For given $\bar j^\nu$ and $M^{\mu\nu}$, the skew-symmetric tensors F and G are thus determined.
Out of two Maxwell equations, just one equation has emerged in four-dimensional
space.


3.4.8 Transformation Behavior of Electromagnetic Fields 


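Under a Lorentz transformation, the fields E and B (D and H) do not behave like
vector fields $A^\mu$, but the electromagnetic field tensors F and G are indeed tensors of
second rank:

$$F'^{\mu\nu} = \partial_\kappa x'^\mu\ \partial_\lambda x'^\nu\ F^{\kappa\lambda}\,.$$

This system of equations corresponds to a matrix equation $F' = \Lambda F\widetilde\Lambda$ (the tilde denoting
the transposed matrix). The antisymmetry $F = -\widetilde F$ is transferred to $F' = \Lambda F\widetilde\Lambda = -\widetilde F'$, so only the six components
with μ < ν have to be determined. Since F is uniquely related to E and B according
to p. 240, and likewise F' to E' and B', this means that the transformation properties
of the fields can be derived using the matrices $\partial_\kappa x'^\mu$ mentioned on p. 232. Then for
a system moving with velocity v, we have the fields

$$\mathbf E'_\parallel = \mathbf E_\parallel\,,\qquad \mathbf E'_\perp = \gamma\,(\mathbf E_\perp + \mathbf v\times\mathbf B)\,,$$

$$\mathbf B'_\parallel = \mathbf B_\parallel\,,\qquad \mathbf B'_\perp = \gamma\,\Big(\mathbf B_\perp - \frac{\mathbf v\times\mathbf E}{c_0^2}\Big)\,.$$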


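These can be combined to give

$$\mathbf E' = \gamma\,\Big(\mathbf E - \frac{\gamma-1}{\gamma}\,\frac{\mathbf v\cdot\mathbf E}{v^2}\,\mathbf v + \mathbf v\times\mathbf B\Big)\,,$$

$$\mathbf B' = \gamma\,\Big(\mathbf B - \frac{\gamma-1}{\gamma}\,\frac{\mathbf v\cdot\mathbf B}{v^2}\,\mathbf v - \frac{\mathbf v\times\mathbf E}{c_0^2}\Big)\,.$$

Thus, the components of the electromagnetic field parallel to the velocity v remain
unmodified, but not the perpendicular ones. In particular, in the non-relativistic limit
γ ≈ 1, it follows that

$$\mathbf E' \approx \mathbf E + \mathbf v\times\mathbf B\,,\qquad \mathbf B' \approx \mathbf B - \frac{\mathbf v\times\mathbf E}{c_0^2}\,.$$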


(Note that the term v × B is well known, but not $\mathbf v\times\mathbf E/c_0^2$, due to technical limitations:
we can produce strong magnetic fields, but strong electric fields are found
only in the interior of atoms. Because it is actually $c_0\mathbf B$ that should be compared
with E in order to have equal units, i.e., 1 T ≙ 3 MV/cm, we should have considered
$\mathbf E + \boldsymbol\beta\times c_0\mathbf B$ rather than $c_0\mathbf B - \boldsymbol\beta\times\mathbf E$.) Therefore, on a slowly moving electric point
charge q an electromagnetic field acts with the Lorentz force

$$\mathbf F = q\,(\mathbf E + \mathbf v\times\mathbf B)\,,$$

and on a moving magnetic moment m, an electric field acts, because $\mathbf F = \nabla\,\mathbf m\cdot\mathbf B$
leads to

$$\mathbf F = \nabla\,\mathbf m\cdot\Big(\mathbf B - \frac{\mathbf v\times\mathbf E}{c_0^2}\Big)\,.$$


In particular for a radially symmetric central field, we have

$$\mathbf E = -\nabla\Phi = -\frac{d\Phi}{dr}\,\frac{\mathbf r}{r}\,,$$

and hence $\mathbf v\times\mathbf E = r^{-1}\,(d\Phi/dr)\,(\mathbf r\times\mathbf v)$. We thus arrive at the spin-orbit coupling,
because there is a magnetic moment associated with a spin and r × v with an orbital
angular momentum. According to this derivation, this is not a relativistic effect,
despite what is often claimed.

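Correspondingly, we can now establish the transformation properties of D and H
from the behavior of the tensor G, which is the same as that of the tensor F. We only
need to replace E by $c_0^2\,\mathbf D$ and B by H, which yields

$$\mathbf D' = \gamma\,\Big(\mathbf D - \frac{\gamma-1}{\gamma}\,\frac{\mathbf v\cdot\mathbf D}{v^2}\,\mathbf v + \frac{\mathbf v\times\mathbf H}{c_0^2}\Big)\,,$$

$$\mathbf H' = \gamma\,\Big(\mathbf H - \frac{\gamma-1}{\gamma}\,\frac{\mathbf v\cdot\mathbf H}{v^2}\,\mathbf v - \mathbf v\times\mathbf D\Big)\,.$$

For the reverse transformation from the primed to the unprimed system, v is simply
replaced by −v, giving

$$\mathbf E = \gamma\,\Big(\mathbf E' - \frac{\gamma-1}{\gamma}\,\frac{\mathbf v\cdot\mathbf E'}{v^2}\,\mathbf v - \mathbf v\times\mathbf B'\Big)\,,\ \ldots$$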


Here the components of E, B, D, and H along v remain unchanged. 
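The equivalence between the tensor transformation law and the closed formulas for E' and B' can be checked numerically. The sketch below works with e = E/c₀ and β = v/c₀ so that no unit conversion is needed, and restricts the tensor route to a boost along x; all function names and sample values are illustrative assumptions.

```python
import math


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def field_tensor(e, b):
    """F^{mu nu} built from e = E/c0 and b = B."""
    return [[0.0, -e[0], -e[1], -e[2]],
            [e[0], 0.0, -b[2], b[1]],
            [e[1], b[2], 0.0, -b[0]],
            [e[2], -b[1], b[0], 0.0]]


def boost_x(beta):
    """Matrix of derivatives dx'^mu/dx^kappa for a frame moving with +beta along x."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0], [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


def transform_tensor(F, L):
    """F'^{mu nu} = L^mu_kappa L^nu_lambda F^{kappa lambda}."""
    return [[sum(L[m][k] * L[n][l] * F[k][l]
                 for k in range(4) for l in range(4))
             for n in range(4)] for m in range(4)]


def fields_from_tensor(F):
    """Read e = E/c0 and b = B back from F^{mu nu}."""
    return (F[1][0], F[2][0], F[3][0]), (F[3][2], F[1][3], F[2][1])


def transform_fields(e, b, beta_vec):
    """Closed formulas for e' and b' (beta_vec must be nonzero)."""
    b2 = sum(x * x for x in beta_vec)
    g = 1.0 / math.sqrt(1.0 - b2)
    be = sum(x * y for x, y in zip(beta_vec, e))
    bb = sum(x * y for x, y in zip(beta_vec, b))
    bxb, bxe = cross(beta_vec, b), cross(beta_vec, e)
    e_new = tuple(g * (e[i] - (g - 1.0) / g * be / b2 * beta_vec[i] + bxb[i])
                  for i in range(3))
    b_new = tuple(g * (b[i] - (g - 1.0) / g * bb / b2 * beta_vec[i] - bxe[i])
                  for i in range(3))
    return e_new, b_new


e, b, beta = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0), 0.6   # assumed sample values
e_tensor, b_tensor = fields_from_tensor(
    transform_tensor(field_tensor(e, b), boost_x(beta)))
e_formula, b_formula = transform_fields(e, b, (beta, 0.0, 0.0))
print(e_tensor, b_tensor)
```

Both routes give the same primed fields, and the x components (parallel to v) come out unchanged.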


3.4.9 Relativistic Dynamics of Free Particles 


From the velocity we derive the (mechanical) momentum:

$$(p^\mu) = m\,(v^\mu) = m\gamma\,(c_0, \mathbf v)\,,\quad\text{with}\quad p_\mu p^\mu = (mc_0)^2\,.$$

Here, m stands for the mass (a relativistic invariant), often called the rest mass, while
$m\gamma = m/\sqrt{1-\beta^2}$ is called the relativistic mass, even though the factor γ belongs
solely to the velocity; without it the velocity of light would not be the same in all
inertial frames. It is thus a kinematic factor and has nothing to do with the mass. The
zeroth component $p^0$ is connected to the energy:

$$p^0 = \frac{E}{c_0}\quad\Longrightarrow\quad E = m\gamma\,c_0^2\,.$$

Note that the concept of the position–momentum pair corresponds to the time–energy
pair (we neglect the potential energy and consider only free particles). The total
energy E is composed of the rest energy $mc_0^2$ and the kinetic energy

$$T \equiv E - mc_0^2 = m\,(\gamma-1)\,c_0^2 = \tfrac12\,mv^2 + \ldots$$


By specifying the rest energy, we set the zero-point of the energy, so that it is no longer
arbitrary. We have thus set:

$$(p^\mu) = \Big(\frac{E}{c_0},\ \mathbf p\Big)\,,\quad\text{with}\quad E = m\gamma\,c_0^2\quad\text{and}\quad \mathbf p = m\gamma\,\mathbf v\,.$$

With $p_\mu p^\mu = (mc_0)^2$, we conclude that $(E/c_0)^2 - \mathbf p\cdot\mathbf p = (mc_0)^2$, or again, restricting
ourselves to the positive square root,

$$E = c_0\,\sqrt{(mc_0)^2 + \mathbf p\cdot\mathbf p}\,.$$

From the previous pair of equations we conclude

$$\mathbf p = \frac{E}{c_0^2}\,\mathbf v\,.$$

(This holds for all m ≠ 0, and hence we assume it also for m = 0.) For m ≠ 0 and
with $v \to c_0$ so that $\gamma \to \infty$, E and p also increase beyond all limits. In contrast,
for m = 0, the relation $p_\mu p^\mu = (mc_0)^2$ yields $E = c_0\,p$. This leads us to the unit
vector $\mathbf p/p = \mathbf v/c_0$: in every inertial frame, massless particles move with the velocity
of light.
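The relations between energy, momentum, and velocity can be cross-checked with a few lines of Python. This is a sketch in SI units with motion along one axis; the function names and the sample mass of 2 kg are illustrative assumptions.

```python
import math

C0 = 299_792_458.0  # vacuum velocity of light in m/s


def lorentz_factor(v):
    """gamma = 1 / sqrt(1 - v^2/c0^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C0) ** 2)


def momentum(m, v):
    """p = m * gamma * v."""
    return m * lorentz_factor(v) * v


def energy(m, v):
    """E = m * gamma * c0^2."""
    return m * lorentz_factor(v) * C0 ** 2


def energy_from_momentum(m, p):
    """E = c0 * sqrt((m c0)^2 + p^2), valid also for m = 0."""
    return C0 * math.sqrt((m * C0) ** 2 + p * p)


def kinetic_energy(m, v):
    """T = E - m c0^2 = m (gamma - 1) c0^2."""
    return energy(m, v) - m * C0 ** 2


m, v = 2.0, 0.8 * C0                     # assumed sample values
p = momentum(m, v)
print(energy(m, v), energy_from_momentum(m, p), p * C0 ** 2 / energy(m, v))
```

The last printed value reproduces v, i.e. $\mathbf p = E\,\mathbf v/c_0^2$, and for small velocities `kinetic_energy` approaches $\tfrac12 mv^2$.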

In order to derive the Lagrange function for free particles, we use the integral
principles (see Sect. 2.4.8) and take into account the fact that the proper time τ
(but not the coordinate time t) is Lorentz invariant. According to p. 230, we have
dt = γ dτ. Now Hamilton’s principle states that the action function

$$W = \int_{t_0}^{t_1} L\,dt = \int_{\tau_0}^{\tau_1} \gamma L\,d\tau$$

takes an extreme value. This must be valid for all reference frames. Consequently,
γL must be Lorentz invariant. (However, we shall not introduce an abbreviation for
γL.) As explained on p. 250, L is connected to the Lagrange density $\mathcal L$ as used on
p. 241. For free particles this function depends on the four-velocity, but not on the
space-time coordinates. Then we only have to find out how γL depends on $v_\mu v^\mu$.
Hence we investigate the ansatz $\gamma L = m\,f(v_\mu v^\mu)$, bearing in mind the requirement

$$p^\mu = m\,v^\mu\quad\text{and}\quad p^\mu = -\frac{\partial\,\gamma L}{\partial v_\mu}\,.$$

We already had the first equation at the beginning of this section. The second connects
two contravariant quantities and generalizes $\mathbf p = \nabla_{\mathbf v} L$ (see p. 99) with $v_k = -v^k$ to
four dimensions. With $\partial(v_\nu v^\nu)/\partial v_\mu = 2v^\mu$, this requirement can be satisfied by any
function f with $df/d(v_\mu v^\mu) = -1/2$. However, because $v_\mu v^\mu = c_0^2$, this does not
seem to be unique. Hence the Lagrange function is often derived from Fermat’s
principle, valid for free particles according to p. 141, or the geodesic principle,

$$\delta\int_{t_0}^{t_1} dt = 0\,,\quad\text{or}\quad \delta\int_{s_0}^{s_1} ds = 0\,.$$


(For free particles, the velocity is constant, so the two expressions yield the same
orbit.) If now σ increases monotonically with the proper time τ, but otherwise is an
arbitrary parameter, we have

$$ds = \sqrt{g_{\mu\nu}\,\frac{dx^\mu}{d\sigma}\,\frac{dx^\nu}{d\sigma}}\ d\sigma\,.$$

Here the coordinates $x^\mu$ and their derivatives can be varied. Since the parameter σ
does not need to be equal to the proper time, the inconvenient condition $v_\mu v^\mu = c_0^2$
does not apply for the variation. On the other hand, it may be equal to the proper time,
and then the expression under the square root is equal to $v_\mu v^\mu$. Consequently, γL is
equal to the square root of $v_\mu v^\mu$, up to a fixed factor, and this factor we derive from
the requirement that, in the non-relativistic limit, we should have L ≈ T + const. with
$T = \tfrac12\,m\,\mathbf v\cdot\mathbf v$:

$$L = -\frac{mc_0}{\gamma}\,\sqrt{v_\mu v^\mu} = -mc_0\,\sqrt{c_0^2 - \mathbf v\cdot\mathbf v} \approx -mc_0^2 + \tfrac12\,m\,\mathbf v\cdot\mathbf v\,.$$


Since here (for free particles) the Lagrange function does not depend on the
space-time coordinates, the Euler-Lagrange differential equation (p. 96) yields

$$\frac{dp^\mu}{d\tau} = 0\,,$$

and hence also the energy and momentum conservation law for free particles (see
Fig. 3.29).

Fig. 3.29 Lagrange function and momentum of free particles as a function of β = v/c_0: non-
relativistic (dashed blue) and relativistic (continuous red)


3.4.10 Relativistic Dynamics with External Forces 


In classical mechanics (see Sect. 2.3.4), we have already derived the generalized
potential U for the interaction of a particle of charge q with an electromagnetic
field, namely, $U = q\,(\Phi - \mathbf v\cdot\mathbf A)$. After multiplying by γ, this expression is Lorentz
invariant:

$$\gamma\,q\,(\Phi - \mathbf v\cdot\mathbf A) = q\,v_\mu A^\mu\,.$$

Here $A^\mu$ depends only on the space-time coordinates $x^\mu$, but not on $v^\mu$. Hence we
obtain the Lagrange function

$$L = -\frac{mc_0\,\sqrt{v_\mu v^\mu} + q\,v_\mu A^\mu}{\gamma}\,.$$

This yields the canonical conjugate momentum

$$p^\mu = -\frac{\partial\,\gamma L}{\partial v_\mu} = m\,v^\mu + q\,A^\mu\,.$$

We have already considered its three-space components on p. 99, though not yet
relativistically, and distinguished between the mechanical momentum and the canonical
conjugate momentum. Its time component $p^0$ is related to the energy $E = c_0\,p^0$,
which now (with suitable gauge, see p. 124) also contains the potential energy qΦ,
with $A^0 = \Phi/c_0$, according to p. 239.


Important for the Lagrange equations is

$$\frac{dp^\mu}{d\tau} = m\,\frac{dv^\mu}{d\tau} + q\,\frac{dA^\mu}{d\tau}\,,\quad\text{with}\quad \frac{dA^\mu}{d\tau} = v_\nu\,\partial^\nu A^\mu\,,$$

because with $\dot{\mathbf p} = \nabla L$, this must be equal to $-\partial^\mu\,\gamma L$. With the above expression for
the Lagrange function, it follows that $-\partial^\mu\,\gamma L = q\,v_\nu\,\partial^\mu A^\nu$. Then we arrive at the
electromagnetic field tensor (see p. 240)

$$\frac{dp^\mu}{d\tau} = -\frac{\partial\,\gamma L}{\partial x_\mu}\quad\Longrightarrow\quad m\,\frac{dv^\mu}{d\tau} = q\,v_\nu F^{\mu\nu}\,.$$

Here, for μ = 0, $v_\nu F^{0\nu}$ is equal to $(-\gamma\,\mathbf v)\cdot(-\mathbf E/c_0) = \gamma\,\mathbf v\cdot\mathbf E/c_0$, and the space
components can be combined into the three-vector $\gamma\,(\mathbf E + \mathbf v\times\mathbf B)$. $F^\mu \equiv q\,v_\nu F^{\mu\nu}$ is
referred to as the Minkowski force:

$$F^\mu = m\,\frac{dv^\mu}{d\tau}\,.$$

Its space components are greater by the factor γ than those of the Newtonian force.
Its time component is related to the power γ j · E.
The last equation also holds for forces other than electromagnetic ones.


3.4.11 Energy-Momentum Stress Tensor 


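We would like now to extend Maxwell’s stress tensor to four dimensions. To this
end, we go from the Minkowski force $F^\mu = q\,v_\nu F^{\mu\nu}$ over to a force density:

$$f^\mu = j_\nu F^{\mu\nu}\,.$$

With $\mu_0\,j_\nu = \partial^\kappa F_{\kappa\nu}$, we have $\mu_0 f^\mu = (\partial^\kappa F_{\kappa\nu})\,F^{\mu\nu} = \partial^\kappa(F_{\kappa\nu}F^{\mu\nu}) - F_{\kappa\nu}\,\partial^\kappa F^{\mu\nu}$.
The last term can be rewritten, because F is antisymmetric and the summation
indices κ and ν may be renamed:

$$F_{\kappa\nu}\,\partial^\kappa F^{\mu\nu} = -\tfrac12\,F_{\kappa\nu}\,(\partial^\nu F^{\mu\kappa} + \partial^\kappa F^{\nu\mu})\,.$$

It is then simplified using the Maxwell equations:

$$F_{\kappa\nu}\,\partial^\kappa F^{\mu\nu} = \tfrac12\,F_{\kappa\nu}\,\partial^\mu F^{\kappa\nu} = \tfrac14\,\partial^\mu F^{\kappa\lambda}F_{\kappa\lambda}\,.$$

Consequently, with $\partial^\mu = g^\mu{}_\kappa\,\partial^\kappa$, we find $\mu_0 f^\mu = \partial^\kappa(F_{\kappa\nu}F^{\mu\nu} - \tfrac14\,g^\mu{}_\kappa\,F^{\lambda\nu}F_{\lambda\nu})$.
Therefore, the force density $f^\mu$ is the (four-dimensional) source density of a symmetric
tensor:

$$f^\mu = -\partial_\nu T^{\mu\nu}\,,\quad\text{with}\quad T^{\mu\nu} \equiv \frac{\tfrac14\,g^{\mu\nu}F^{\kappa\lambda}F_{\kappa\lambda} - F^\mu{}_\kappa F^{\nu\kappa}}{\mu_0} = T^{\nu\mu}\,.$$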


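If we restrict ourselves to $\mathbf D = \varepsilon_0\mathbf E$ and $\mathbf B = \mu_0\mathbf H$, then we can extend Maxwell’s
stress tensor, introduced on p. 215, with elements $T_{xx} = w - \varepsilon_0 E_x E_x - \mu_0 H_x H_x$
and $T_{xy} = -\varepsilon_0 E_x E_y - \mu_0 H_x H_y$ (and cyclic permutations), with the energy density
$w = \tfrac12\,(\mathbf E\cdot\mathbf D + \mathbf H\cdot\mathbf B)$ and the Poynting vector $\mathbf S = \mathbf E\times\mathbf H$, into four dimensions:

$$(T^{\mu\nu}) = \begin{pmatrix} w & S_x/c_0 & S_y/c_0 & S_z/c_0\\ S_x/c_0 & T_{xx} & T_{xy} & T_{xz}\\ S_y/c_0 & T_{yx} & T_{yy} & T_{yz}\\ S_z/c_0 & T_{zx} & T_{zy} & T_{zz} \end{pmatrix},\quad\text{with}\quad \operatorname{tr} T = T^\mu{}_\mu = 0\,.$$

The stress tensor known from the static case is now completed with the Poynting
vector and the energy density. According to p. 215, $\mathbf S/c_0^2$ is a momentum density,
whence T is referred to as the energy-momentum stress tensor. The space components
of $f^\mu = -\partial_\nu T^{\mu\nu}$ are

$$f^i + \frac{1}{c_0^2}\,\frac{\partial S^i}{\partial t} + \frac{\partial T^{ik}}{\partial x^k} = 0\,.$$

In addition, $f^0 = j_\nu F^{0\nu} = \mathbf j\cdot\mathbf E/c_0$, so $-\mathbf j\cdot\mathbf E = c_0\,\partial_\nu T^{0\nu} = \partial w/\partial t + \nabla\cdot\mathbf S$. We already
know this equation (p. 211) as Poynting’s theorem.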
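The identification of the components of $T^{\mu\nu}$ with the energy density, the Poynting vector, and Maxwell's stress tensor can be confirmed numerically for the vacuum case. The sketch below assumes the conventional value $\mu_0 = 4\pi\cdot10^{-7}$ V s/(A m) and arbitrary sample fields; the function names are illustrative and not taken from the text.

```python
import math

C0 = 299_792_458.0                # vacuum velocity of light in m/s
MU0 = 4.0e-7 * math.pi            # assumed conventional value in V s/(A m)
EPS0 = 1.0 / (MU0 * C0 ** 2)      # from c0^-2 eps0^-1 = mu0
G = [1.0, -1.0, -1.0, -1.0]       # diagonal of the Minkowski metric


def field_tensor(E, B):
    """Contravariant field tensor F^{mu nu}."""
    ex, ey, ez = (component / C0 for component in E)
    bx, by, bz = B
    return [[0.0, -ex, -ey, -ez], [ex, 0.0, -bz, by],
            [ey, bz, 0.0, -bx], [ez, -by, bx, 0.0]]


def energy_momentum_tensor(E, B):
    """T^{mu nu} = (g^{mu nu} F^{kl} F_{kl} / 4 - F^mu_k F^{nu k}) / mu0."""
    F = field_tensor(E, B)
    invariant = sum(G[k] * G[l] * F[k][l] ** 2
                    for k in range(4) for l in range(4))
    T = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            metric = G[m] if m == n else 0.0
            mixed = sum(F[m][k] * G[k] * F[n][k] for k in range(4))
            T[m][n] = (0.25 * metric * invariant - mixed) / MU0
    return T


E = (100.0, 200.0, -50.0)         # assumed sample field in V/m
B = (1.0e-6, -2.0e-7, 3.0e-7)     # assumed sample field in T
T = energy_momentum_tensor(E, B)
w = 0.5 * (EPS0 * sum(x * x for x in E) + sum(x * x for x in B) / MU0)
S_x = (E[1] * B[2] - E[2] * B[1]) / MU0   # x component of E x H
print(T[0][0], w, T[0][1], S_x / C0)
```

The time-time component reproduces w, the time-space components give $\mathbf S/c_0$, and the trace $T^\mu{}_\mu = T^{00} - T^{11} - T^{22} - T^{33}$ vanishes up to rounding.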


3.4.12 Summary: Lorentz Invariance 


Maxwell’s equations ensure the same vacuum velocity of light in all inertial frames:
the laws of electromagnetism are Lorentz invariant. The space-time description must
be adjusted to this fact, something that leads to unusual consequences for high velocities.
Just as time and space have to be combined to give $x^\mu = (c_0 t, \mathbf r)$, so also do
charge and current density to give $j^\mu = (c_0\rho, \mathbf j)$, energy and momentum to give
$p^\mu = (E/c_0, \mathbf p)$, scalar and vector potential to give $A^\mu = (\Phi/c_0, \mathbf A)$, and angular
frequency and wave vector to give $k^\mu = (\omega/c_0, \mathbf k)$. By building skew-symmetric tensors
$F^{\mu\nu}$ and $G^{\mu\nu}$ from $\mathbf E/c_0$ and B and from $c_0\mathbf D$ and H, respectively, the pairs of
Maxwell equations for microscopic electromagnetism can each be combined into
one equation, viz.,

$$\partial^\lambda F^{\mu\nu} + \partial^\mu F^{\nu\lambda} + \partial^\nu F^{\lambda\mu} = 0\quad\text{and}\quad \partial_\mu F^{\mu\nu} = \mu_0\,j^\nu\,,$$

and those for macroscopic electromagnetism into

$$\partial^\lambda F^{\mu\nu} + \partial^\mu F^{\nu\lambda} + \partial^\nu F^{\lambda\mu} = 0\quad\text{and}\quad \partial_\mu G^{\mu\nu} = \bar j^\nu\,.$$


In addition, the equation $f^\mu = -\partial_\nu T^{\mu\nu}$ with the (symmetric) energy-momentum
stress tensor

$$T^{\mu\nu} = \frac{\tfrac14\,g^{\mu\nu}F^{\kappa\lambda}F_{\kappa\lambda} - F^\mu{}_\kappa F^{\nu\kappa}}{\mu_0}$$

combines Poynting’s theorem and the relation between force density and Maxwell’s
stress tensor.

Lorentz invariance leads to the fact that, in classical mechanics, derivatives with
respect to time must be replaced by derivatives with respect to the proper time,
thereby introducing the factor γ. In particular, for free particles of mass m, we have
the momentum $(p^\mu) = m\,(v^\mu) = m\gamma\,(c_0, \mathbf v)$ with $p^0 = E/c_0$, or $E = m\gamma\,c_0^2$ and
$\mathbf p = c_0^{-2}\,E\,\mathbf v$, and otherwise for particles with the charge q,

$$p^\mu = -\frac{\partial\,\gamma L}{\partial v_\mu} = m\,v^\mu + q\,A^\mu\,.$$

In the expression $(p^\mu) = m\gamma\,(c_0, \mathbf v)$, the factor γ belongs to the velocity, not to the
mass: the mass is a Lorentz invariant, as is $v_\mu v^\mu$, but the latter only because of the factor γ. There
is no “velocity-dependent mass” (see L.B. Okun: Phys. Today 42, 6 (1989) 31-36.)


3.4.13 Supplement: Hamiltonian Formalism for Fields 


On p. 241 the Lagrange function known from the mechanics of particles was extended
to the Lagrange density $\mathcal L$ for the electromagnetic field. Here we present the transition
to the Hamiltonian formulation, which is often applied to field quantization,
even though there are other ways to derive the latter, as we shall see in Sect. 5.5.2.
After introducing the Lagrange density $\mathcal L$, Hamilton’s principle reads

$$\delta\int d^4x\ \mathcal L(x^\mu, \eta, \partial_\mu\eta) = 0\,,\quad\text{with}\quad L = \int dV\ \mathcal L\,,$$


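where the coordinates $x^\mu$ are given and the parameter (or parameters) η of the system
are to be varied. For the electromagnetic field, η is equal to the four-potential $A^\mu$.
Therefore, with the Einstein summation convention, we may set

$$\delta\mathcal L = \frac{\partial\mathcal L}{\partial\eta}\,\delta\eta + \frac{\partial\mathcal L}{\partial(\partial_\mu\eta)}\,\delta(\partial_\mu\eta)\,,$$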
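using the abbreviation $\partial_\mu\eta \equiv \partial\eta/\partial x^\mu$. We may change the order of the derivative
with respect to $x^\mu$ and the variation, i.e., $\delta(\partial_\mu\eta) = \partial\,\delta\eta/\partial x^\mu$, and integrate by parts.
However, here η depends not only on the single coordinate $x^\mu$, but also on the three
remaining ones, and therefore the implicit dependence of the field quantity η on the
$x^\mu$ must also be accounted for, although in many textbooks there is only the partial
derivative instead of the total derivative in the next equation:

$$\int dx^\mu\ \frac{\partial\mathcal L}{\partial(\partial_\mu\eta)}\,\frac{\partial\,\delta\eta}{\partial x^\mu} = \Big[\frac{\partial\mathcal L}{\partial(\partial_\mu\eta)}\,\delta\eta\Big] - \int dx^\mu\ \Big(\frac{d}{dx^\mu}\,\frac{\partial\mathcal L}{\partial(\partial_\mu\eta)}\Big)\,\delta\eta\,,$$

where there is of course no summation over μ. Since η is to be kept fixed at the
integration limits during the variation, the first term on the right-hand side vanishes.
Hence Hamilton’s principle appears in the form

$$\int d^4x\ \Big\{\frac{\partial\mathcal L}{\partial\eta} - \frac{d}{dx^\mu}\,\frac{\partial\mathcal L}{\partial(\partial_\mu\eta)}\Big\}\,\delta\eta = 0\,,$$

and we obtain the Euler-Lagrange equations

$$\frac{\partial\mathcal L}{\partial\eta} = \frac{d}{dx^\mu}\,\frac{\partial\mathcal L}{\partial(\partial_\mu\eta)}\,,$$

with Einstein’s summation convention. This we may also write as

$$\frac{d}{dt}\,\frac{\partial\mathcal L}{\partial\dot\eta} = \frac{\partial\mathcal L}{\partial\eta} - \frac{d}{dx^k}\,\frac{\partial\mathcal L}{\partial(\partial_k\eta)}\,.$$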


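The similarity with the usual equation appears more clearly if we use the Lagrange
function L instead of the Lagrange density $\mathcal L$, but now take it as a functional of the
functions η and $\dot\eta$, introducing the functional derivatives

$$\frac{\delta L}{\delta\eta} \equiv \frac{\partial\mathcal L}{\partial\eta} - \frac{d}{dx^k}\,\frac{\partial\mathcal L}{\partial(\partial_k\eta)}\quad\text{and}\quad \frac{\delta L}{\delta\dot\eta} \equiv \frac{\partial\mathcal L}{\partial\dot\eta}\,.$$

If we divide space into N cells and discretize to give $L = \sum_i \mathcal L_i\,\Delta V_i$, it follows
that

$$\delta L = \sum_{i=1}^{N}\Big\{\Big(\frac{\partial\mathcal L}{\partial\eta} - \frac{d}{dx^k}\,\frac{\partial\mathcal L}{\partial(\partial_k\eta)}\Big)_i\,\delta\eta_i + \Big(\frac{\partial\mathcal L}{\partial\dot\eta}\Big)_i\,\delta\dot\eta_i\Big\}\,\Delta V_i\,.$$

Since the variations $\delta\eta_i$ and $\delta\dot\eta_i$ with i ∈ {1, ..., N} may be performed independently
of each other, the limit $\Delta V_i \to 0$ can be considered separately for each cell. The
functional derivative $\delta L/\delta\eta$ still contains a factor $(\Delta V)^{-1}$. Therefore the Lagrange
density $\mathcal L$ appears on the right. Hence the result reads simply

$$\frac{d}{dt}\,\frac{\delta L}{\delta\dot\eta} = \frac{\delta L}{\delta\eta}\,,$$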


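which is similar to the normal Lagrange equation. However, because of the functional
derivatives, we are now dealing with a differential equation, from which we must
now determine η(t, r) rather than x(t): instead of the coordinates x (possibly very
many, but nevertheless a finite number), a whole field must now be determined.
The quantity canonically conjugate to the field quantity $\eta_i$ in the volume $\Delta V_i$ is

$$p_i = \frac{\partial L}{\partial\dot\eta_i} = \Big(\frac{\partial\mathcal L}{\partial\dot\eta}\Big)_i\,\Delta V_i \equiv \Delta V_i\ \pi_i\,,$$

with the momentum density

$$\pi \equiv \frac{\partial\mathcal L}{\partial\dot\eta} = \frac{\delta L}{\delta\dot\eta}\,,$$

where $\pi_i$ is its mean value in the volume $\Delta V_i$. If we go over from the Lagrangian to
the Hamiltonian mechanics then, with $\mathcal H(x^\mu, \eta, \pi, \partial_k\eta)$, we also have

$$dH = \int d^3x\ \Big(\frac{\partial\mathcal H}{\partial t}\,dt + \frac{\partial\mathcal H}{\partial\eta}\,d\eta + \frac{\partial\mathcal H}{\partial\pi}\,d\pi + \frac{\partial\mathcal H}{\partial(\partial_i\eta)}\,d(\partial_i\eta)\Big)\,.$$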


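We integrate the last term by parts (without the summation convention):

$$\int dx^i\ \frac{\partial\mathcal H}{\partial(\partial_i\eta)}\ d\Big(\frac{\partial\eta}{\partial x^i}\Big) = \Big[\frac{\partial\mathcal H}{\partial(\partial_i\eta)}\,d\eta\Big] - \int dx^i\ \Big(\frac{d}{dx^i}\,\frac{\partial\mathcal H}{\partial(\partial_i\eta)}\Big)\,d\eta\,.$$

The integrated term vanishes if the considered system exists only in a finite volume,
as we have assumed. Hence,

$$dH = \int d^3x\ \Big\{\frac{\partial\mathcal H}{\partial t}\,dt + \Big(\frac{\partial\mathcal H}{\partial\eta} - \frac{d}{dx^i}\,\frac{\partial\mathcal H}{\partial(\partial_i\eta)}\Big)\,d\eta + \frac{\partial\mathcal H}{\partial\pi}\,d\pi\Big\}\,,$$

with the summation convention. Instead of the round bracket, we may also write the
functional derivative $\delta H/\delta\eta$. On the other hand, the relation

$$dH = \int d^3x\ \Big(\pi\,d\dot\eta + \dot\eta\,d\pi - \frac{\delta L}{\delta\eta}\,d\eta - \frac{\delta L}{\delta\dot\eta}\,d\dot\eta - \frac{\partial\mathcal L}{\partial t}\,dt\Big)$$

follows from $\mathcal H = \pi\,\dot\eta - \mathcal L$ with $\pi = \partial\mathcal L/\partial\dot\eta$. Here, since $\pi = \delta L/\delta\dot\eta$, the first
term cancels the fourth. If we also use $\delta L/\delta\eta = \dot\pi$, then we may set

$$dH = \int d^3x\ \Big(-\frac{\partial\mathcal L}{\partial t}\,dt - \dot\pi\,d\eta + \dot\eta\,d\pi\Big)\,.$$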


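Comparing with the expression found above, we obtain the Hamilton equations for
a field, viz.,

$$\frac{\partial\mathcal H}{\partial t} = -\frac{\partial\mathcal L}{\partial t}\,,\qquad \frac{\delta H}{\delta\eta} = -\dot\pi\,,\quad\text{and}\quad \frac{\delta H}{\delta\pi} = \dot\eta\,,$$

because $\mathcal H$ does not depend on the spatial derivatives of π, and therefore $\delta H/\delta\pi = \partial\mathcal H/\partial\pi$.
The Hamilton function H is a conserved quantity if dH/dt vanishes. Clearly,

$$\frac{dH}{dt} = \int d^3x\ \frac{\partial\mathcal H}{\partial t}\,,$$

because $d\eta/dt = \dot\eta$ and $d\pi/dt = \dot\pi$ cancels the remaining terms of the integrand.
The time dependence of an arbitrary quantity O can be obtained from

$$\frac{dO}{dt} = \int d^3x\ \Big(\frac{\delta O}{\delta\eta}\,\dot\eta + \frac{\delta O}{\delta\pi}\,\dot\pi\Big) + \frac{\partial O}{\partial t} = \int d^3x\ \Big(\frac{\delta O}{\delta\eta}\,\frac{\delta H}{\delta\pi} - \frac{\delta O}{\delta\pi}\,\frac{\delta H}{\delta\eta}\Big) + \frac{\partial O}{\partial t} \equiv [O, H] + \frac{\partial O}{\partial t}\,.$$

For the last equation we have extended the concept of the Poisson bracket to fields,
as an abbreviation for the preceding integral.
The Poisson bracket $[\eta_i, p_i] = 1$ of particle physics has become $[\eta_i, \pi_i] = 1/\Delta V_i$
in field theory. For the limit $\Delta V_i \to 0$,

$$[\eta(t, \mathbf r), \pi(t, \mathbf r')] = \delta(\mathbf r - \mathbf r')\,,$$

and after a Fourier transform $[\eta(t, \mathbf k), \pi(t, \mathbf k')] = \delta(\mathbf k - \mathbf k')$.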


3.5 Radiation Fields 


3.5.1 Solutions of the Inhomogeneous Wave Equations 


Now we turn to the potential equations of microscopic electromagnetism from
Sect. 3.4.6 (with the Lorentz and Coulomb gauges):

$$\Box A^\mu = \mu_0\,j^\mu\,,\quad\text{or}\quad \Box\mathbf A = \mu_0\,\mathbf j_{\mathrm{trans}}\,.$$

Here the inhomogeneities may also depend on time, because otherwise we just obtain
the already known static solutions. We solve both equations with the same Green
function, since they involve the same differential operator □ and differ only in the
inhomogeneity. This Green function generalizes the expression (for the Laplace
operator Δ) known from statics. In particular, it takes into account the fact that space
and time are connected with each other via the velocity of light $c_0$:

$$\Box\ \frac{\delta(t' - t \pm |\mathbf r - \mathbf r'|/c_0)}{4\pi\,|\mathbf r - \mathbf r'|} = \delta(t - t')\ \delta(\mathbf r - \mathbf r')\,.$$

So far we have considered the limit $c_0 \to \infty$ (□ → −Δ) and we were therefore
allowed to omit the delta function δ(t − t') on the left- and right-hand sides. We
shall use only the Green function with the plus sign: the source at the position r' acts
at the chosen point r after the lapse of time $t - t' = |\mathbf r - \mathbf r'|/c_0$. This is called the
retarded solution. The Green function with the minus sign is known as the advanced
solution. It is mathematically but not physically allowable, because effects then occur
before their cause.

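Before proving the validity of these Green functions, we first show their Lorentz invariance. If we use $\delta\{(x_\mu-x'_\mu)(x^\mu-x'^\mu)\} = \delta\{(c_0t-c_0t')^2-|\mathbf r-\mathbf r'|^2\}$ and take into account the equation on p. 20, viz.,

$$\delta\{(c_0\Delta t)^2-|\Delta\mathbf r|^2\} = \frac{\delta(\Delta t-|\Delta\mathbf r|/c_0)+\delta(\Delta t+|\Delta\mathbf r|/c_0)}{2c_0\,|\Delta\mathbf r|}\,,$$

it follows that

$$\frac{\delta(t'-t\pm|\mathbf r-\mathbf r'|/c_0)}{|\mathbf r-\mathbf r'|} = 2c_0\,\varepsilon\{\pm(t-t')\}\;\delta\{(x_\mu-x'_\mu)(x^\mu-x'^\mu)\}\,.$$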


Here, the step function $\varepsilon\{\pm(t-t')\}$ seems to violate Lorentz invariance, but we wish to distinguish uniquely between past and future, and therefore restrict ourselves to the retarded solutions, that is, to proper Lorentz transformations. Forwards and backwards light cones then remain separated.

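For the actual proof, we use the Fourier representation of the delta function (see p. 21) with $\mathbf R = \mathbf r-\mathbf r'$ and $k = \omega/c_0$, i.e.,

$$\frac{\delta(t'-t\pm R/c_0)}{R} = \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathrm d\omega\,\frac{\exp\{\mathrm i\omega(t'-t\pm R/c_0)\}}{R} = \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathrm d\omega\,\exp\{\mathrm i\omega(t'-t)\}\,\frac{\exp(\pm\mathrm ikR)}{R}\,.$$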


The d’Alembert operator in the “time representation” then becomes $-(\Delta+k^2)$ in the “frequency representation”. Now in the general case (the special case $k = 0$ was already considered on p. 26)

$$(\Delta+k^2)\,\frac{\exp(\pm\mathrm ikR)}{R} = -4\pi\,\delta(\mathbf R)\,.$$

According to p. 39, for $R\neq 0$, the left-hand side is equal to $R^{-1}(\partial^2/\partial R^2+k^2)\exp(\pm\mathrm ikR)$, hence zero. However, for $R = 0$, it is singular, and its volume integral, according to p. 27, is equal to $-4\pi$.
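The statement that $(\Delta+k^2)\exp(\pm\mathrm ikR)/R$ vanishes away from the origin can be checked numerically. The following is only a minimal sketch: the function names and the finite-difference step `h` are illustrative choices, and it uses the radial form of the Laplace operator, $\Delta g = g'' + (2/R)\,g'$, which holds for a function of $R$ alone.

```python
import cmath

def spherical_wave(R, k, sign=+1):
    """exp(+-ikR)/R, the Green function in the frequency representation up to a constant."""
    return cmath.exp(sign * 1j * k * R) / R

def helmholtz_residual(R, k, sign=+1, h=1e-3):
    """Central-difference estimate of (Laplace + k^2) acting on exp(+-ikR)/R at R != 0.

    For a function g(R) of the radius only, Laplace g = g'' + (2/R) g'.
    The step h is a hypothetical choice for this sketch.
    """
    g_plus = spherical_wave(R + h, k, sign)
    g_zero = spherical_wave(R, k, sign)
    g_minus = spherical_wave(R - h, k, sign)
    first = (g_plus - g_minus) / (2 * h)
    second = (g_plus - 2 * g_zero + g_minus) / h**2
    return second + 2 * first / R + k**2 * g_zero

if __name__ == "__main__":
    for sign in (+1, -1):
        # The residual should be tiny compared with k^2 |g|, which is of order 1 here.
        print(sign, abs(helmholtz_residual(R=2.0, k=3.0, sign=sign)))
```

The residual is limited only by the discretization error, so it shrinks as `h` is reduced (until rounding errors take over). The singular contribution at $R = 0$ is of course not captured by such a pointwise check.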




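Thus, we have for the Lorentz gauge,

$$A^\mu(t,\mathbf r) = \frac{\mu_0}{4\pi}\int\mathrm dt'\,\mathrm dV'\;j^\mu(t',\mathbf r')\,\frac{\delta(t'-t+|\mathbf r-\mathbf r'|/c_0)}{|\mathbf r-\mathbf r'|} = \frac{\mu_0}{4\pi}\int\mathrm dV'\,\frac{j^\mu(t-|\mathbf r-\mathbf r'|/c_0,\ \mathbf r')}{|\mathbf r-\mathbf r'|}$$

as the retarded solution. The continuity equation already ensures the gauge condition $\partial_\mu A^\mu = 0$. If we use $\int\mathrm dt\,\exp(\mathrm i\omega t)\,\delta(t'-t+R/c_0) = \exp(\mathrm i\omega t')\exp(\mathrm ikR)$ for the Fourier transform, we obtain the expression

$$A^\mu(\omega,\mathbf r) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathrm dt\;A^\mu(t,\mathbf r)\,\exp(\mathrm i\omega t) = \frac{\mu_0}{4\pi}\int\mathrm dV'\;j^\mu(\omega,\mathbf r')\,\frac{\exp(\mathrm ik|\mathbf r-\mathbf r'|)}{|\mathbf r-\mathbf r'|}\,.$$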


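Note that we take $\exp(\mathrm i\omega t)$ and not $\exp(-\mathrm i\omega t)$, since that leads us to $\omega t-\mathbf k\cdot\mathbf r = k_\mu x^\mu$. Of course, $j^\mu(\omega,\mathbf r)$ is related to $j^\mu(t,\mathbf r)$ via the same Fourier transform.

Hence the source density is easy to determine, since $\nabla f(|\mathbf r-\mathbf r'|) = -\nabla' f(|\mathbf r-\mathbf r'|)$:

$$\nabla\cdot\mathbf A(\omega,\mathbf r) = -\frac{\mu_0}{4\pi}\int\mathrm dV'\;\mathbf j(\omega,\mathbf r')\cdot\nabla'\,\frac{\exp(\mathrm ik|\mathbf r-\mathbf r'|)}{|\mathbf r-\mathbf r'|}\,.$$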


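With $\mathbf j\cdot\nabla'G = \nabla'\cdot(G\,\mathbf j)-G\,\nabla'\cdot\mathbf j$, we can split the integral into two terms. The first can be converted according to Gauss into a surface integral and does not contribute, since $\mathbf j$ vanishes on the surface, while the second can be rewritten with the continuity equation, because using $\rho(t,\mathbf r)\propto\rho(\omega,\mathbf r)\exp(-\mathrm i\omega t)$ and $\mathbf j(t,\mathbf r)\propto\mathbf j(\omega,\mathbf r)\exp(-\mathrm i\omega t)$, it reads

$$\nabla\cdot\mathbf j(\omega,\mathbf r) = \mathrm i\omega\,\rho(\omega,\mathbf r) = \frac{\mathrm i\omega}{c_0}\,j^0(\omega,\mathbf r)\,.$$

Consequently,

$$\nabla\cdot\mathbf A(\omega,\mathbf r) = \frac{\mathrm i\omega}{c_0}\,A^0(\omega,\mathbf r)\,,$$

and hence also $\partial_\mu A^\mu = 0$. In the given expression for $A^\mu$, the continuity equation $\partial_\mu j^\mu = 0$ already ensures the Lorentz gauge.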

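For the derivation of $\partial_\mu A^\mu = 0$, the current density must vanish on the surface of the integration volume. For the Coulomb gauge, only the transverse current density is of interest (transverse gauge): then it is already sufficient that the current density should not have a normal component there. Then the source freedom for the Fourier transformed $\mathbf A(\omega,\mathbf r)$ is easily checked. As in Sect. 3.2.8, we use Gauss’s theorem, assuming no current density at infinity and the source freedom of $\mathbf j_{\rm trans}$. As $\mathbf A(\omega,\mathbf r)$ is solenoidal, this is true also for $\mathbf A(t,\mathbf r)$. In the Coulomb gauge, because $\Delta\Phi = -\rho/\varepsilon_0$, we have

$$\Phi(t,\mathbf r) = \frac{1}{4\pi\varepsilon_0}\int\mathrm dV'\,\frac{\rho(t,\mathbf r')}{|\mathbf r-\mathbf r'|}\,,$$

$$\mathbf A(t,\mathbf r) = \frac{\mu_0}{4\pi}\int\mathrm dt'\,\mathrm dV'\;\mathbf j_{\rm trans}(t',\mathbf r')\,\frac{\delta(t'-t+|\mathbf r-\mathbf r'|/c_0)}{|\mathbf r-\mathbf r'|}\,,$$

and after a Fourier transform

$$\Phi(\omega,\mathbf r) = \frac{1}{4\pi\varepsilon_0}\int\mathrm dV'\,\frac{\rho(\omega,\mathbf r')}{|\mathbf r-\mathbf r'|}\,,$$

$$\mathbf A(\omega,\mathbf r) = \frac{\mu_0}{4\pi}\int\mathrm dV'\;\mathbf j_{\rm trans}(\omega,\mathbf r')\,\frac{\exp(\mathrm ik|\mathbf r-\mathbf r'|)}{|\mathbf r-\mathbf r'|}\,.$$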


We would like to use these expressions for the radiation fields, which is why the 
Coulomb gauge is often also called radiation gauge. The fact that the radiation is 
transverse is more important for us than Lorentz invariance. For this reason, the 
radiation gauge is also used in quantum electrodynamics (see Sect. 5.5.1). 


3.5.2 Radiation Fields 


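For the magnetic field, with $\mathbf B(\omega,\mathbf r) = \nabla\times\mathbf A(\omega,\mathbf r)$, we obtain

$$\mathbf B(\omega,\mathbf r) = -\frac{\mu_0}{4\pi}\int\mathrm dV'\;\mathbf j_{\rm trans}(\omega,\mathbf r')\times\nabla\,\frac{\exp(\mathrm ik|\mathbf r-\mathbf r'|)}{|\mathbf r-\mathbf r'|}\,,$$

with

$$\nabla\,\frac{\exp(\mathrm ik|\mathbf r-\mathbf r'|)}{|\mathbf r-\mathbf r'|} = \Big(\mathrm ik-\frac{1}{|\mathbf r-\mathbf r'|}\Big)\,\frac{\exp(\mathrm ik|\mathbf r-\mathbf r'|)}{|\mathbf r-\mathbf r'|}\;\frac{\mathbf r-\mathbf r'}{|\mathbf r-\mathbf r'|}\,.$$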


We thus have two terms with different position dependence. For time-dependent problems (thus $k\neq 0$), the field decays more weakly with the distance from the source than in the static case. This is also shown by the representation

$$\mathbf B(t,\mathbf r) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathrm d\omega\;\mathbf B(\omega,\mathbf r)\,\exp(-\mathrm i\omega t)\,,$$


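because we have

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathrm d\omega\;\mathbf j_{\rm trans}(\omega,\mathbf r')\,\mathrm e^{-\mathrm i\omega t}\,\mathrm e^{\mathrm ikR} = \mathbf j_{\rm trans}(t-R/c_0,\ \mathbf r')\,,$$

$$\frac{-\mathrm i}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathrm d\omega\;\omega\,\mathbf j_{\rm trans}(\omega,\mathbf r')\,\mathrm e^{-\mathrm i\omega t}\,\mathrm e^{\mathrm ikR} = \frac{\partial\mathbf j_{\rm trans}(t-R/c_0,\ \mathbf r')}{\partial t}\,,$$

and hence, with $t' = t-|\mathbf r-\mathbf r'|/c_0$,

$$\mathbf B(t,\mathbf r) = \frac{\mu_0}{4\pi}\int\mathrm dV'\,\Big(\frac{1}{c_0}\frac{\partial}{\partial t}+\frac{1}{|\mathbf r-\mathbf r'|}\Big)\,\mathbf j_{\rm trans}(t',\mathbf r')\times\frac{\mathbf r-\mathbf r'}{|\mathbf r-\mathbf r'|^2}$$

for the magnetic field. Previously, we took the derivative with respect to the position instead of the time and thereby could not account explicitly for the finite propagation velocity.

Fig. 3.30 The approximation $|\mathbf r-\mathbf r'|\approx r-\mathbf e_r\cdot\mathbf r'$, valid for $r\gg r'$, follows by calculation (using a series expansion) as well as geometrically. The circular radius $|\mathbf r-\mathbf r'|$ and the double-headed arrow are nearly equally long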

Since the current density is connected to the velocity, the part with the derivative of 
j with respect to time is called the acceleration field and the second the velocity field. 
With increasing distance from the source, the acceleration field clearly contributes 
most to B. 

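For the electric field, we conclude from $\mathbf E = -\nabla\Phi-\partial\mathbf A/\partial t$ that

$$\mathbf E(\omega,\mathbf r) = -\nabla\Phi(\omega,\mathbf r)+\mathrm i\omega\,\mathbf A(\omega,\mathbf r) = -\frac{1}{4\pi\varepsilon_0}\int\mathrm dV'\,\Big(\rho(\omega,\mathbf r')\,\nabla\frac{1}{|\mathbf r-\mathbf r'|}-\frac{\mathrm i\omega}{c_0^2}\,\mathbf j_{\rm trans}(\omega,\mathbf r')\,\frac{\exp(\mathrm ik|\mathbf r-\mathbf r'|)}{|\mathbf r-\mathbf r'|}\Big)\,,$$

and thus after the Fourier transform $\omega\to t$

$$\mathbf E(t,\mathbf r) = -\frac{1}{4\pi\varepsilon_0}\int\mathrm dV'\,\Big(\rho(t,\mathbf r')\,\nabla\frac{1}{|\mathbf r-\mathbf r'|}+\frac{1}{c_0^2}\,\frac{\partial\mathbf j_{\rm trans}(t',\mathbf r')/\partial t}{|\mathbf r-\mathbf r'|}\Big)\,,$$

noting that we have $t$ in the argument of the charge density, but $t' = t-|\mathbf r-\mathbf r'|/c_0$ for the current density. Here, we write first the longitudinal and then the acceleration field, even though the acceleration field is more important for greater distances.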
For large distances of the chosen point from the source ($r\gg r'$), we may set (see Fig. 3.30)

$$k\,|\mathbf r-\mathbf r'|\approx k\,(r-\mathbf e_r\cdot\mathbf r') = kr-\mathbf k'\cdot\mathbf r'\,,\quad\text{with}\quad\mathbf k' = k\,\mathbf e_r\,.$$
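How good the far-zone replacement $|\mathbf r-\mathbf r'|\approx r-\mathbf e_r\cdot\mathbf r'$ is can be seen from a short numerical comparison. This is a sketch only; the function names and the sample points are hypothetical and not taken from the text.

```python
import math

def exact_distance(r_vec, rp_vec):
    """|r - r'| for two points given as coordinate tuples."""
    return math.dist(r_vec, rp_vec)

def far_zone_distance(r_vec, rp_vec):
    """Far-zone approximation r - e_r . r', with e_r = r/|r|."""
    r = math.hypot(*r_vec)
    e_r = [x / r for x in r_vec]
    return r - sum(e * xp for e, xp in zip(e_r, rp_vec))

if __name__ == "__main__":
    source_point = (0.3, 0.4, 0.5)           # r', with |r'| of order 1 (illustrative)
    for r in (10.0, 100.0, 1000.0):
        field_point = (r, 0.0, 0.0)
        error = exact_distance(field_point, source_point) - far_zone_distance(field_point, source_point)
        # The leading correction is (r'^2 - (e_r . r')^2) / (2 r), so it falls off like 1/r.
        print(r, error)
```

The error is of second order in $r'/r$, which is why it may be dropped in the amplitude but has to be compared with the wavelength when it appears in the phase $k\,|\mathbf r-\mathbf r'|$.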


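Then, using $\mathbf e_r = \mathbf k'/k$,

$$4\pi\varepsilon_0\,\mathbf E(\omega,\mathbf r)\approx\frac{\mathrm ik}{c_0}\,\frac{\exp(\mathrm ikr)}{r}\int\mathrm dV'\;\mathbf j_{\rm trans}(\omega,\mathbf r')\,\exp(-\mathrm i\mathbf k'\cdot\mathbf r')\,,$$

$$c_0\,\mathbf B(\omega,\mathbf r)\approx\mathbf e_r\times\mathbf E(\omega,\mathbf r)\,.$$

In agreement with Fig. 3.19, the vectors $\mathbf e_r$, $\mathbf E$, and $\mathbf B$ are mutually perpendicular for $r\gg r'$, because with $\mathbf e_r = \mathbf k'/k$, we have

$$\mathbf E(\omega,\mathbf r)\cdot\mathbf e_r\propto\int\mathrm dV'\;\mathbf j_{\rm trans}(\omega,\mathbf r')\cdot\mathbf k'\,\exp(-\mathrm i\mathbf k'\cdot\mathbf r')\,,$$

and also $\mathbf j_{\rm trans}\cdot\mathbf k'\exp(-\mathrm i\mathbf k'\cdot\mathbf r')\propto\mathbf j_{\rm trans}\cdot\nabla'\exp(-\mathrm i\mathbf k'\cdot\mathbf r')$, whence generally,

$$\mathbf j_{\rm trans}\cdot\nabla'G = \nabla'\cdot(G\,\mathbf j_{\rm trans})-G\,\nabla'\cdot\mathbf j_{\rm trans}\,.$$

Thus the volume integral vanishes, because there is no current at the surface and $\mathbf j_{\rm trans}$ is solenoidal.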

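In the following we shall often use the Fourier transform (see p. 25 and p. 255)

$$\mathbf j_{\rm trans}(\omega,\mathbf k) = \frac{1}{\sqrt{2\pi}^{\,3}}\int\mathrm dV\,\exp(-\mathrm i\mathbf k\cdot\mathbf r)\;\mathbf j_{\rm trans}(\omega,\mathbf r) = \frac{1}{(2\pi)^2}\int\mathrm dt\,\mathrm dV\,\exp\{\mathrm i(\omega t-\mathbf k\cdot\mathbf r)\}\;\mathbf j_{\rm trans}(t,\mathbf r)\,.$$

In particular, we have just obtained the electric field strength at large distances, viz.,

$$\mathbf E(\omega,\mathbf r)\approx\sqrt{\frac{\pi}{2}}\;\frac{\mathrm ik}{\varepsilon_0c_0}\,\frac{\exp(\mathrm ikr)}{r}\;\mathbf j_{\rm trans}(\omega,\mathbf k)\,,\quad\text{with}\quad\mathbf k = k\,\mathbf e_r\,,$$

and also the magnetic field with $c_0\,\mathbf B(\omega,\mathbf r)\approx\mathbf e_r\times\mathbf E(\omega,\mathbf r)$.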


3.5.3 Radiation Energy 


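Now the Poynting vector can be related to the properties of the radiation source. To this end, we make a Fourier expansion

$$\mathbf E(t,\mathbf r) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathrm d\omega\,\exp(-\mathrm i\omega t)\;\mathbf E(\omega,\mathbf r)$$

(likewise for $\mathbf H$) and obtain for the Poynting vector $\mathbf S = \mathbf E\times\mathbf H$ integrated over the time (according to the Parseval equation on p. 23)

$$\int_{-\infty}^{\infty}\mathrm dt\;\mathbf S(t,\mathbf r) = \int_{-\infty}^{\infty}\mathrm d\omega\;\mathbf E^*(\omega,\mathbf r)\times\mathbf H(\omega,\mathbf r)\,,$$

because $\mathbf E(t,\mathbf r)$ and $\mathbf H(t,\mathbf r)$ are real functions. Hence, with $\mathbf E(\omega,\mathbf r) = \mathbf E^*(-\omega,\mathbf r)$ and $\mathbf H(\omega,\mathbf r) = \mathbf H^*(-\omega,\mathbf r)$, we find

$$\int_{-\infty}^{\infty}\mathrm dt\;\mathbf S(t,\mathbf r) = 2\,\mathrm{Re}\int_0^{\infty}\mathrm d\omega\;\mathbf E^*(\omega,\mathbf r)\times\mathbf H(\omega,\mathbf r)\,.$$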
—0o 0 


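Far from the radiation source, i.e., beyond any magnetization, so $\mathbf H = \mathbf B/\mu_0$, the last section now yields

$$\mathbf E^*(\omega,\mathbf r)\times\mathbf H(\omega,\mathbf r)\approx\frac{1}{\mu_0c_0}\,\mathbf E^*(\omega,\mathbf r)\cdot\mathbf E(\omega,\mathbf r)\;\mathbf e_r\approx\frac{\pi}{2}\,\frac{k^2}{\varepsilon_0c_0}\,|\mathbf j_{\rm trans}(\omega,\mathbf k)|^2\;\frac{\mathbf r}{r^3}\,.$$

With $k^2/\varepsilon_0 = \mu_0\omega^2$, the Poynting vector integrated over all times is therefore asymptotically equal to $\pi\mu_0\,\mathbf r/(c_0r^3)\int_0^\infty\mathrm d\omega\;\omega^2\,|\mathbf j_{\rm trans}(\omega,\mathbf k)|^2$. The energy (in joule) flowing into the solid angle element $\mathrm d\Omega = \mathbf r\cdot\mathrm d\mathbf f/r^3$ is therefore

$$\mathrm dW = \mathrm d\mathbf f\cdot\int_{-\infty}^{\infty}\mathrm dt\;\mathbf S(t,\mathbf r) = \frac{\pi\mu_0}{c_0}\,\mathrm d\Omega\int_0^{\infty}\mathrm d\omega\;\omega^2\,|\mathbf j_{\rm trans}(\omega,\mathbf k)|^2$$

with $\mathbf k = k\,\mathbf e_r$. Here $\mathbf j_{\rm trans}$ is the solenoidal part of the current density, for which, according to Sect. 1.1.11, we may also write $\mathbf j_{\rm trans}(\omega,\mathbf k) = \mathbf e_k\times\{\mathbf j(\omega,\mathbf k)\times\mathbf e_k\}$ with $\mathbf e_k = \mathbf k/k$. Hence $|\mathbf j_{\rm trans}(\omega,\mathbf k)|^2 = |\mathbf j(\omega,\mathbf k)\times\mathbf k/k|^2$.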

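If the frequency range is very sharp, then it is best to work with a single angular frequency $\omega$. However, the time integrals then diverge. For a continuous radiation source, we should consider the radiation power averaged over a period: $\mathbf E(t,\mathbf r) = \mathrm{Re}\,\{\mathbf E(\omega,\mathbf r)\exp(-\mathrm i\omega t)\}$ and the corresponding expression for $\mathbf H(t,\mathbf r)$ lead to

$$\overline{\mathbf S} = \frac{\omega}{2\pi}\int_0^{2\pi/\omega}\mathrm dt\;\mathbf S(t,\mathbf r) = \frac{\mathrm{Re}\,\{\mathbf E^*(\omega,\mathbf r)\times\mathbf H(\omega,\mathbf r)\}}{2}\approx\frac{\pi}{4}\,\frac{\mu_0\,\omega^2}{c_0}\,|\mathbf j_{\rm trans}(\omega,\mathbf k)|^2\;\frac{\mathbf k}{k\,r^2}\,.$$

Therefore, for the average radiation power, we obtain

$$\mathrm d\dot W = \overline{\mathbf S}\cdot\mathrm d\mathbf f\approx\frac{\pi\mu_0}{4c_0}\,\omega^2\,|\mathbf j_{\rm trans}(\omega,\mathbf k)|^2\;\mathrm d\Omega\,.$$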


This generally depends on the direction of k—some examples are given in Sect. 3.5.5. 




3.5.4 Radiation Fields of Point Charges 


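For point charges $q$, the Fourier transform with respect to frequencies does not make sense. Here, it is better to use

$$j^\mu(t,\mathbf r) = q\,\frac{v^\mu(t,\mathbf r')}{\gamma}\;\delta(\mathbf r-\mathbf r')$$

for the four-potential $A^\mu(t,\mathbf r)$. The factor $\gamma^{-1}$ is necessary because of the Lorentz contraction. According to p. 255 and with $\mathbf r'$ as a unique function of $t'$, we have in the Lorentz gauge

$$A^\mu(t,\mathbf r) = \frac{\mu_0q}{4\pi}\int\mathrm dt'\;\frac{v^\mu(t',\mathbf r')}{\gamma}\,\frac{\delta(t'-t+|\mathbf r-\mathbf r'|/c_0)}{|\mathbf r-\mathbf r'|}\,.$$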

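For the delta function, we use the abbreviation

$$\mathbf R = \mathbf r-\mathbf r'\,,\quad\mathbf e = \mathbf R/R\,,\quad\boldsymbol\beta = \mathbf v/c_0\,,$$

and set $u = t'-t+R/c_0$. Then $\mathrm d\mathbf R/\mathrm dt' = -\mathbf v$ and $\mathrm dR/\mathrm dt' = -\mathbf v\cdot\mathbf e$, so we have $\mathrm du/\mathrm dt' = 1-\boldsymbol\beta\cdot\mathbf e$, implying $\mathrm dt' = \mathrm du/(1-\boldsymbol\beta\cdot\mathbf e)$. Then, because $\mathrm dt'\,\delta(t'-t+R/c_0) = \mathrm du\,\delta(u)/(1-\boldsymbol\beta\cdot\mathbf e)$ for the Lorentz gauge, we find the Liénard-Wiechert potential

$$A^\mu(t,\mathbf r) = \frac{\mu_0q}{4\pi}\;\frac{v^\mu(t-R/c_0,\ \mathbf r-\mathbf R)}{\gamma\,(R-\boldsymbol\beta\cdot\mathbf R)}\,.$$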


For the corresponding equations $\Phi\propto c_0^{\,2}$ and $\mathbf A\propto\mathbf v$, we have $(A^\mu) = (c_0^{-1}\Phi,\ \mathbf A)$ and $(v^\mu) = \gamma\,(c_0,\ \mathbf v)$. The factor $\gamma$ then cancels out. The (retarded) fields spread with finite velocity, and therefore depend on $A^\mu$ and $v^\mu$ at different times, depending upon the distance $R$ (see Fig. 3.31). Here, with $(\beta^\mu) = \gamma\,(1,\ \boldsymbol\beta)$ and $(R^\mu) = (c_0t-c_0t',\ \mathbf r-\mathbf r')$, $\gamma\,(R-\boldsymbol\beta\cdot\mathbf R)$ can also be written as the scalar product $\beta_\mu R^\mu$. This does not depend on the reference frame. Since emitter and receiver move against each other, $R$ alone would not make sense. In fact, information is not radiated evenly in all space directions, but preferentially in the direction of the motion.
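The role of the retarded time in the Liénard-Wiechert potential can be made concrete for a uniformly moving charge, where a closed form is available for comparison. The following sketch makes several assumptions that are not in the text: units with $c_0 = 1$ and $q/(4\pi\varepsilon_0) = 1$, a fixed-point iteration for $t'$ (which converges because $\beta<1$), and the standard closed-form result for uniform motion along the $x$ axis. All function names are illustrative.

```python
import math

C0 = 1.0  # assumption: units with c0 = 1 and q/(4 pi eps0) = 1

def retarded_time(t, r, trajectory, iterations=200):
    """Solve t' = t - |r - r'(t')|/c0 by fixed-point iteration (contracts since v < c0)."""
    tp = t
    for _ in range(iterations):
        tp = t - math.dist(r, trajectory(tp)) / C0
    return tp

def lienard_wiechert_phi(t, r, trajectory, velocity):
    """Scalar potential 1/(R - beta.R), evaluated at the retarded position of the charge."""
    tp = retarded_time(t, r, trajectory)
    R_vec = [a - b for a, b in zip(r, trajectory(tp))]
    R = math.hypot(*R_vec)
    beta_dot_R = sum(v * x for v, x in zip(velocity(tp), R_vec)) / C0
    return 1.0 / (R - beta_dot_R)

def uniform_motion_phi(t, r, v):
    """Closed form for a charge moving with constant speed v along x (standard result)."""
    x, y, z = r
    beta_sq = (v / C0) ** 2
    return 1.0 / math.sqrt((x - v * t) ** 2 + (1.0 - beta_sq) * (y * y + z * z))

if __name__ == "__main__":
    v = 0.6
    point, time = (1.0, 2.0, -0.5), 0.7
    phi_retarded = lienard_wiechert_phi(time, point, lambda tp: (v * tp, 0.0, 0.0), lambda tp: (v, 0.0, 0.0))
    print(phi_retarded, uniform_motion_phi(time, point, v))
```

The two numbers agree, although the first is built from the retarded position and the second only from the present position of the charge. For accelerated motion no such shortcut exists, and the retarded time has to be determined as in `retarded_time`.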

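For the fields $\mathbf E = -\nabla\Phi-\partial\mathbf A/\partial t$ and $\mathbf B = \nabla\times\mathbf A$, we also have to take into account the retardation effect. Instead of the derivative $\partial/\partial t$, we should take the derivative $\partial/\partial t'$, and for $\nabla$, keep $t'$ (but not $t$) fixed. If, as in Sect. 1.2.7, we indicate the fixed quantity by a subscript on the differential or operator in brackets, we have $(\nabla)_t = (\nabla)_{t'}+(\nabla t')_t\,\partial/\partial t'$. In order to find $(\nabla t')_t$, we determine the action on $R = c_0\,(t-t')$. This is $(\nabla R)_t = -c_0\,(\nabla t')_t$, and $(\nabla R)_{t'} = \mathbf e$, whence

$$(\nabla)_t = (\nabla)_{t'}-\frac{\mathbf e}{(1-\boldsymbol\beta\cdot\mathbf e)\,c_0}\,\frac{\partial}{\partial t'}$$

and

$$\frac{\partial R}{\partial t} = c_0\,\Big(1-\frac{\partial t'}{\partial t}\Big) = \frac{\partial R}{\partial t'}\,\frac{\partial t'}{\partial t}\quad\Longrightarrow\quad\frac{\partial t'}{\partial t} = \frac{1}{1-\boldsymbol\beta\cdot\mathbf e}\quad\Longrightarrow\quad\frac{\partial}{\partial t} = \frac{1}{1-\boldsymbol\beta\cdot\mathbf e}\,\frac{\partial}{\partial t'}\,.$$

From the above expression for $\Phi$ and $\mathbf A$ (independent of the gauge) and with $\nabla(\boldsymbol\beta\cdot\mathbf R) = \boldsymbol\beta$, $\partial R/(c_0\,\partial t') = -\boldsymbol\beta\cdot\mathbf e$, and $\partial(\boldsymbol\beta\cdot\mathbf R)/(c_0\,\partial t') = \dot{\boldsymbol\beta}\cdot\mathbf R/c_0-\beta^2$, we obtain

$$\mathbf E(t,\mathbf r) = \frac{q}{4\pi\varepsilon_0}\,\frac{1}{(R-\boldsymbol\beta\cdot\mathbf R)^3}\,\Big(\frac{\mathbf R-R\,\boldsymbol\beta}{\gamma^2}+\frac{\{\dot{\boldsymbol\beta}\times(\mathbf R-R\,\boldsymbol\beta)\}\times\mathbf R}{c_0}\Big)\,,$$

$$c_0\,\mathbf B(t,\mathbf r) = \mathbf e\times\mathbf E(t,\mathbf r)\,.$$

Fig. 3.31 The existence of a point charge (•) in the spherical shell becomes noticeable only after the time $\Delta t = R/c_0$. During this time the point charge will already have displaced by $\mathbf v\,\Delta t$, but that becomes observable only later on the spherical shell. The associated Lorentz invariant $\{\gamma\,(R-\boldsymbol\beta\cdot\mathbf R)\}^{-1} = 1/\beta_\mu R^\mu$, which is the weight factor in the Liénard-Wiechert potential, is sketched here as a continuous line for one fixed value of $\beta$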


The second term here decays more weakly by one power of R than the first, but 
occurs only for accelerated charges: it describes the acceleration field, and the first 
the velocity field. On the right here, all quantities are to be evaluated at the retarded 
position of the charge. The magnetic field is always perpendicular to the electric 
field. 


3.5.5 Radiation Fields of Oscillating Dipoles 


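Let us now investigate a dipole oscillating with the angular frequency $\omega$, with the maximum dipole moment $\hat p$. In the coordinates $t$ and $\mathbf r$, we may then replace $\mathbf j = \rho\,\mathbf v$ by $\dot{\mathbf p}$. In the equation for $\mathbf B(\omega,\mathbf r)$ from p. 256, we therefore use the expression $-\mathrm i\omega\,\mathbf p$ as the Fourier component of $\mathbf j(\omega,\mathbf r)$:

$$\mathbf B(\omega,\mathbf r) = \frac{\mathrm i\omega\mu_0}{4\pi}\,\Big(\mathrm ik-\frac{1}{r}\Big)\,\frac{\mathbf p\times\mathbf r}{r}\;\frac{\exp(\mathrm ikr)}{r}\,.$$

The magnetic field is thus perpendicular to $\mathbf r$ and $\mathbf p$. For $\mathbf p = \hat p\,\mathbf e_z$, $\mathbf B(\omega,\mathbf r)$ has only a $\varphi$ component (proportional to $\hat p\sin\theta$). From $\varepsilon_0\,\partial\mathbf E/\partial t = \nabla\times\mathbf B/\mu_0$, we then also have the associated electric field (outside the origin):

$$\mathbf E(\omega,\mathbf r) = \frac{\mathrm ic_0^{\,2}}{\omega}\,\nabla\times\mathbf B(\omega,\mathbf r) = \frac{1}{4\pi\varepsilon_0}\,\Big\{\Big(k^2+\frac{\mathrm ik}{r}-\frac{1}{r^2}\Big)\,\mathbf p-\Big(k^2+\frac{3\mathrm ik}{r}-\frac{3}{r^2}\Big)\,\frac{\mathbf p\cdot\mathbf r}{r}\,\frac{\mathbf r}{r}\Big\}\;\frac{\exp(\mathrm ikr)}{r}\,.$$

With $\mathbf p = \hat p\,\mathbf e_z$, this vector has an $r$ and a $\theta$ component.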

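We derive the picture of the field lines from $\mathrm d\mathbf r\times\mathbf E = 0$, because $\mathrm d\mathbf r$ must have the direction of $\mathbf E$ and may be written $\mathbf e_r\,\mathrm dr+\mathbf e_\theta\,r\,\mathrm d\theta$, then express $\mathbf E\propto\nabla\times\mathbf B$ in spherical coordinates (see p. 39). Since for our choice of the dipole direction, $\mathbf B$ has only a $\varphi$ component, we find (independently of time)

$$\frac{\partial}{\partial r}\,(r\sin\theta\,B_\varphi)\;\mathrm dr+\frac{\partial}{\partial\theta}\,(r\sin\theta\,B_\varphi)\;\mathrm d\theta = 0\,.$$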


This differential equation has the solution $r\sin\theta\,B_\varphi = \text{const.}$, where according to the first equation of this section, $B_\varphi$ is complex, and so also is the constant. This result is exact and does not rely on approximations. In particular, it is not partitioned into near-, middle-, and far-zone, which would be useless in our search for zeros. If we now split into real and imaginary parts and set $\rho = kr$, then, because $r\,B_\varphi\propto(\mathrm i-\rho^{-1})\,\sin\theta\;\mathrm e^{\mathrm i\rho}$, we find $\sin^{-2}\theta\propto\cos\rho-\sin\rho/\rho$ and $\sin^{-2}\theta\propto\sin\rho+\cos\rho/\rho$. The spherical shells with $\rho = \tan\rho$ or $\rho = -\cot\rho = \tan(\rho+\tfrac{\pi}{2})$ thus belong to the set of solutions: there the curl densities reverse ($\odot\leftrightarrow\otimes$) and, according to the induction law, so also does $\partial\mathbf B/\partial t$, which means that $\mathbf B$ is extremal there. It vanishes everywhere on the axis $\theta = 0$ in the direction of the dipoles. Figure 3.32 shows the electric field lines at two times, and Fig. 3.33 the magnetic field lines at the same times.

The distance between the spheres decreases continuously and is only equal to $\lambda/2$ in the far-zone, because the factor $\mathrm i-\rho^{-1}$ can also be written in the form $\mathrm i\,\sqrt{1+\rho^{-2}}\;\exp(\mathrm i\arctan\rho^{-1})$, leading to the trigonometric functions of $\rho+\arctan\rho^{-1}$.
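The positions of these spherical shells can be found numerically. The sketch below is an illustration only: it rewrites $\rho = \tan\rho$ and $\rho = -\cot\rho$ in the pole-free forms $\sin\rho-\rho\cos\rho = 0$ and $\rho\sin\rho+\cos\rho = 0$ and brackets each root by hand; the bisection helper and the function names are not from the text.

```python
import math

def bisect(f, a, b, tol=1e-12):
    """Plain bisection for a root of f in [a, b], assuming f(a) and f(b) differ in sign."""
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if (fa < 0) == (fm < 0):
            a, fa = m, fm
        else:
            b = m
    return 0.5 * (a + b)

def shells_tan(n_max):
    """Roots of rho = tan(rho); the n-th root lies between n*pi and n*pi + pi/2."""
    f = lambda rho: math.sin(rho) - rho * math.cos(rho)
    return [bisect(f, n * math.pi, n * math.pi + math.pi / 2) for n in range(1, n_max + 1)]

def shells_cot(n_max):
    """Roots of rho = -cot(rho); the n-th root lies between n*pi - pi/2 and n*pi."""
    f = lambda rho: rho * math.sin(rho) + math.cos(rho)
    return [bisect(f, n * math.pi - math.pi / 2, n * math.pi) for n in range(1, n_max + 1)]

if __name__ == "__main__":
    roots = shells_tan(6)
    spacings = [b - a for a, b in zip(roots, roots[1:])]
    # In units of rho = k r, a spacing of pi corresponds to half a wavelength.
    print(roots)
    print([s / math.pi for s in spacings])
```

The printed ratios lie slightly above 1 and fall towards 1 with increasing radius, in line with the statement that the spacing reaches $\lambda/2$ only in the far-zone.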


3.5.6 Radiation Power for Dipole, Braking, and Synchrotron Radiation


For the radiation power at sufficiently large distances from the source, only the acceleration field is of interest. According to p. 261 for point-like sources, it is given by

$$\mathbf E = \frac{\mu_0q}{4\pi}\,\frac{\{\dot{\mathbf v}\times(\mathbf e-\boldsymbol\beta)\}\times\mathbf e}{(1-\boldsymbol\beta\cdot\mathbf e)^3\,R}\quad\text{and}\quad\mathbf B = \frac{\mathbf e\times\mathbf E}{c_0}\,,$$

with $R = |\mathbf r-\mathbf r'|$. Then also

$$\mathbf S = \frac{E^2}{\mu_0c_0}\;\mathbf e = \frac{\mu_0}{c_0}\,\Big(\frac{q}{4\pi}\Big)^2\,\frac{\{(\dot{\mathbf v}\times(\mathbf e-\boldsymbol\beta))\times\mathbf e\}^2}{(1-\boldsymbol\beta\cdot\mathbf e)^6\,R^2}\;\mathbf e\,.$$

We shall use this expression (or $\mathrm d\dot W/\mathrm d\Omega = R^2\,\mathbf S\cdot\mathbf e$) for various examples.

Fig. 3.32 Electric field lines of the Hertz dipole at two times. At the circles (dotted lines), the curl densities reverse ($\odot\leftrightarrow\otimes$), and so also does $\partial\mathbf B/\partial t$, implying that $\mathbf B$ is extremal there

Fig. 3.33 Magnetic field lines of the Hertz dipole at the same times as in Fig. 3.32, here in an inclined view of the central plane. At the dotted lines, $\mathbf B$ is extremal again, while at the dashed lines, the field direction is reversed
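In particular, for low velocities ($v\ll c_0$, i.e., $\beta\ll 1$), it follows that

$$\mathbf E\approx\frac{\mu_0q}{4\pi}\,\frac{(\dot{\mathbf v}\times\mathbf e)\times\mathbf e}{R}\,,\qquad\mathbf S\approx\frac{\mu_0}{c_0}\,\Big(\frac{q}{4\pi}\Big)^2\,\frac{(\dot{\mathbf v}\times\mathbf e)^2}{R^2}\;\mathbf e\,,$$

and thus for the radiation power into the solid angle element $\mathrm d\Omega$,

$$\frac{\mathrm d\dot W}{\mathrm d\Omega}\approx\frac{\mu_0}{c_0}\,\Big(\frac{q}{4\pi}\Big)^2\,(\dot{\mathbf v}\times\mathbf e)^2\,.$$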

The radiation thus depends on the angle between $\dot{\mathbf v}$ and $\mathbf R$ through $\sin^2\theta$. Therefore, with $2\pi\int_0^\pi\sin\theta\,\mathrm d\theta\,\sin^2\theta = 2\pi\int_{-1}^{1}\mathrm d\cos\theta\;(1-\cos^2\theta) = \tfrac{2}{3}\,4\pi$, the integration over all directions yields the Larmor formula

$$\dot W\approx\frac{\mu_0\,q^2}{6\pi\,c_0}\;\dot v^2$$

for the total radiation power. Here $\dot{\mathbf v}$ is to be taken at the retarded time $t' = t-R/c_0$ (as are all the remaining quantities).
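The angular integration behind the Larmor formula is easy to confirm numerically. The sketch below uses a simple midpoint rule; the function names, the number of sample points, and the numerical values of $\mu_0$ and $c_0$ in SI units are choices made here for illustration.

```python
import math

MU0 = 4e-7 * math.pi     # assumption: conventional SI value of mu_0 in V s/(A m)
C0 = 299_792_458.0       # speed of light in m/s

def solid_angle_integral_of_sin_squared(n=100_000):
    """Midpoint-rule value of 2*pi * integral_0^pi sin(theta) * sin(theta)^2 dtheta."""
    h = math.pi / n
    return 2 * math.pi * h * sum(math.sin((i + 0.5) * h) ** 3 for i in range(n))

def larmor_power(q, vdot):
    """Total radiated power mu_0 q^2 vdot^2 / (6 pi c_0) for v << c_0."""
    return MU0 * q**2 * vdot**2 / (6 * math.pi * C0)

if __name__ == "__main__":
    print(solid_angle_integral_of_sin_squared(), 8 * math.pi / 3)
    q, vdot = 1.602e-19, 1.0e20   # illustrative charge (C) and acceleration (m/s^2)
    from_angular = MU0 / C0 * (q / (4 * math.pi)) ** 2 * vdot**2 * solid_angle_integral_of_sin_squared()
    print(from_angular, larmor_power(q, vdot))
```

Both printed pairs agree to within the accuracy of the quadrature, which shows that the factor $6\pi$ in the Larmor formula is just $(4\pi)^2$ divided by the angular integral $\tfrac{2}{3}\,4\pi$.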

Due to this radiation power, the oscillation is damped. This is referred to as the radiative reaction. In order to calculate it for a (nearly harmonic) oscillating charge $q$ with mass $m$, we use the results on p. 99 to relate the decay constant $\gamma = \alpha/(2m)$ to the radiation power $\dot W = \alpha v^2$, obtaining $\gamma = \dot W/(2mv^2)$. Since the ratio of the acceleration and velocity amplitudes for a harmonic oscillation is given by the angular frequency $\omega$, we conclude that the decay constant is

$$\gamma = \frac{\mu_0\,q^2}{6\pi\,c_0}\,\frac{\omega^2}{2m}\,.$$

This derivation assumes weak damping, viz., $\gamma\ll\omega$. For electrons, $\omega\ll 3\times10^8$ PHz must hold, and this is true even for visible light, where $\omega$ is a few PHz. Fourier analysis supplies a not quite sharp frequency: the decay constant leads to a natural line width. For heat motion, this is also modified by the Doppler effect.
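The quoted bound for electrons follows directly from $\gamma = \omega$. A minimal numerical sketch, assuming rounded SI values for the constants (the names of the constants and functions are chosen here):

```python
import math

MU0 = 4e-7 * math.pi            # V s/(A m), conventional value (assumption)
C0 = 299_792_458.0              # m/s
E_CHARGE = 1.602_176_634e-19    # C
M_ELECTRON = 9.109_383_7e-31    # kg (rounded)

def decay_constant(omega, q=E_CHARGE, m=M_ELECTRON):
    """gamma = mu_0 q^2 omega^2 / (12 pi c_0 m), the decay constant of the text."""
    return MU0 * q**2 * omega**2 / (12 * math.pi * C0 * m)

def limiting_angular_frequency(q=E_CHARGE, m=M_ELECTRON):
    """Angular frequency at which gamma would reach omega (weak damping needs omega far below it)."""
    return 12 * math.pi * C0 * m / (MU0 * q**2)

if __name__ == "__main__":
    omega_max = limiting_angular_frequency()
    print(omega_max / 1e15, "PHz")                 # order 10^8 PHz
    omega_visible = 3e15                           # a few PHz, illustrative
    print(decay_constant(omega_visible) / omega_visible)
```

For visible light the ratio $\gamma/\omega$ comes out of order $10^{-8}$, so the weak-damping assumption is very well satisfied.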

The last equations are valid only for $v\ll c_0$. This condition is always satisfied for the oscillating dipole (with sufficiently small displacements). If $\mathbf p$ is its dipole moment (at the retarded time), we have to set $q\,\dot{\mathbf v} = \ddot{\mathbf p}$. For a harmonic oscillation with angular frequency $\omega$, $\ddot{\mathbf p} = -\omega^2\,\mathbf p$, and with the maximum dipole moment $\hat p$, we find, for the radiation power averaged over a period,

$$\overline{\dot W} = \frac{\mu_0}{12\pi\,c_0}\;\omega^4\,\hat p^{\,2}\,,$$

because the square of the trigonometric functions is on average equal to 1/2. The radiation power (and thus the scattering power) increases as the fourth power of the frequency. Applied to the scattering of sunlight in the air, since $\omega_{\rm blue}\approx 2\,\omega_{\rm red}$, this explains the blue sky and the red dawn and dusk.

Dipole radiation is linearly polarized. It oscillates in the plane spanned by $\dot{\mathbf v}$ and $\mathbf e$, and perpendicularly to $\mathbf e$ (transverse).

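If we now give up the restriction to small velocities, then for the calculation of the radiation power, we have to account for the retardation. In a time unit, energy is lost at the rate $-\mathrm dW/\mathrm dt'$, while $\mathbf S$ is the energy current density at the position $\mathbf r$ and at time $t$. Therefore, it still depends on $\mathrm dt/\mathrm dt'$ (see p. 261). Hence, with $\dot W = \mathrm dW/\mathrm dt'$, we find

$$\frac{\mathrm d\dot W}{\mathrm d\Omega} = \frac{\mu_0}{c_0}\,\Big(\frac{q}{4\pi}\Big)^2\,\frac{\{(\dot{\mathbf v}\times(\mathbf e-\boldsymbol\beta))\times\mathbf e\}^2}{(1-\boldsymbol\beta\cdot\mathbf e)^5}\,.$$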

We shall now consider this expression for two special cases. 




Fig. 3.34 Polar diagram for braking radiation. Left: For $\beta_0\approx 0$. Right: For a larger $\beta_0$. The arrow specifies the direction of the original velocity. The difference in size of the two pictures is to indicate the intensity difference, even though it is not at all to scale, because in total the energy $\tfrac12\,mv_0^{\,2}$ is radiated off on the left, and $(\gamma_0-1)\,mc_0^{\,2}$ on the right, with $v_0^{\,2}\ll c_0^{\,2}$. The plane perpendicular to $\mathbf v$ appears as a cone due to aberration, and is indicated by dashed lines. $\cos\theta' = 0$ corresponds to $\cos\theta = \beta$


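To begin with we assume a longitudinal acceleration (deceleration) $\dot{\mathbf v}\parallel\mathbf v$. Then $\dot{\mathbf v}\times\boldsymbol\beta$ vanishes, and we obtain

$$\mathbf E = \frac{\mu_0q}{4\pi}\,\frac{(\dot{\mathbf v}\times\mathbf e)\times\mathbf e}{(1-\boldsymbol\beta\cdot\mathbf e)^3\,R}\quad\text{and}\quad\mathbf S = \frac{\mu_0}{c_0}\,\Big(\frac{q}{4\pi}\Big)^2\,\frac{(\dot{\mathbf v}\times\mathbf e)^2}{(1-\boldsymbol\beta\cdot\mathbf e)^6\,R^2}\;\mathbf e\,.$$

The electromagnetic field thus differs from the one for $v\ll c_0$ only by the factor $(1-\boldsymbol\beta\cdot\mathbf e)^{-3}$. Consequently, the radiation into the forwards direction is even stronger than expected non-relativistically.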

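Let us consider braking radiation (also known as deceleration radiation or bremsstrahlung) as an example. Here, $\dot{\mathbf v}$ is constant: the velocity decreases at a constant rate from $v_0$ to zero. From $\mathrm dv/\mathrm dt' = \dot v$, and hence $\dot v^2\,\mathrm dt' = \dot v\,\mathrm dv$, the energy radiated into the solid angle element is (see Fig. 3.34)

$$\frac{\mathrm dW}{\mathrm d\Omega} = \frac{\mu_0}{c_0}\,\Big(\frac{q}{4\pi}\Big)^2\,\dot v\,\sin^2\theta\int_0^{v_0}\frac{\mathrm dv}{(1-v\,c_0^{-1}\cos\theta)^5} = \frac{\mu_0}{4}\,\Big(\frac{q}{4\pi}\Big)^2\,\frac{\dot v\,\sin^2\theta}{\cos\theta}\,\Big(\frac{1}{(1-v_0\,c_0^{-1}\cos\theta)^4}-1\Big)\,.$$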

Of course, this relation holds only for truly constant deceleration. 
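The velocity integral that appears in the braking-radiation formula can be compared with its closed form. The following sketch uses units with $c_0 = 1$ and a midpoint rule; both function names and the sample values of $v_0$ and $\cos\theta$ are illustrative.

```python
def velocity_integral_numeric(v0, cos_theta, c0=1.0, n=20_000):
    """Midpoint-rule value of integral_0^v0 dv / (1 - v cos(theta)/c0)^5."""
    h = v0 / n
    return h * sum((1.0 - (i + 0.5) * h * cos_theta / c0) ** -5 for i in range(n))

def velocity_integral_closed(v0, cos_theta, c0=1.0):
    """Closed form c0/(4 cos(theta)) * ((1 - v0 cos(theta)/c0)^-4 - 1), valid for cos(theta) != 0."""
    return c0 / (4.0 * cos_theta) * ((1.0 - v0 * cos_theta / c0) ** -4 - 1.0)

if __name__ == "__main__":
    for cos_theta in (0.8, 0.2, -0.6):
        print(cos_theta, velocity_integral_numeric(0.5, cos_theta), velocity_integral_closed(0.5, cos_theta))
```

For $\cos\theta\to 0$ the closed form has to be understood as a limit, which gives $v_0$. The strong growth of the integral for $\cos\theta$ close to 1 reflects the forward peaking discussed above.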

The linear polarization of braking radiation is given by $\mathbf E\propto(\dot{\mathbf v}\times\mathbf e)\times\mathbf e$, as for dipole radiation, thus in the plane spanned by $\dot{\mathbf v}$ and $\mathbf e$ and perpendicular to $\mathbf e$.

For synchrotron radiation, the acceleration is perpendicular to the velocity, i.e., $\dot{\mathbf v}\cdot\boldsymbol\beta = 0$. This leads to a radiation power

$$\frac{\mathrm d\dot W}{\mathrm d\Omega} = \frac{\mu_0}{c_0}\,\Big(\frac{q}{4\pi}\Big)^2\,\frac{1}{(1-\boldsymbol\beta\cdot\mathbf e)^3}\,\Big(\dot v^2-\frac{(\dot{\mathbf v}\cdot\mathbf e)^2}{\gamma^2\,(1-\boldsymbol\beta\cdot\mathbf e)^2}\Big)\,,$$

once again more into the forward direction in comparison with the non-relativistic limit.

Fig. 3.35 Polar diagrams of the synchrotron radiation. Left: For $\beta_0\approx 0$. Right: For $\beta_0 = 1/2$. Continuous line: In the plane of the trajectory. Dotted line: Perpendicular to the trajectory. Dashed lines: In-between in 15° steps. The arrow specifies the direction of $\mathbf v$, and the plane perpendicular to $\mathbf v$ is indicated. Here only the line intersecting the plane of the trajectory is important (compare with Fig. 3.32, where the direction of the acceleration is likewise shown dashed)

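The linear polarization of synchrotron radiation lies in the plane spanned by $\dot{\mathbf v}$ and $\mathbf e-\boldsymbol\beta$, in particular, perpendicular to $\mathbf e$, because

$$\mathbf E\propto\{\dot{\mathbf v}\times(\mathbf e-\boldsymbol\beta)\}\times\mathbf e = \dot{\mathbf v}\cdot\mathbf e\;(\mathbf e-\boldsymbol\beta)-(1-\boldsymbol\beta\cdot\mathbf e)\;\dot{\mathbf v}\,.$$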
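Both the vector identity just used and the angular distribution for $\dot{\mathbf v}\perp\boldsymbol\beta$ can be verified with a few lines of vector algebra. This is a sketch with hand-written `dot` and `cross` helpers and arbitrarily chosen sample vectors; none of the names are from the text, and the common prefactor $(\mu_0/c_0)(q/4\pi)^2$ is left out.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def double_cross(vdot, e, beta):
    """{vdot x (e - beta)} x e, the vector that fixes the direction of the acceleration field."""
    return cross(cross(vdot, [ei - bi for ei, bi in zip(e, beta)]), e)

def expanded_form(vdot, e, beta):
    """(vdot.e)(e - beta) - (1 - beta.e) vdot."""
    ve, w = dot(vdot, e), 1.0 - dot(beta, e)
    return [ve * (ei - bi) - w * vi for ei, bi, vi in zip(e, beta, vdot)]

def general_angular_factor(vdot, e, beta):
    """|{vdot x (e - beta)} x e|^2 / (1 - beta.e)^5."""
    d = double_cross(vdot, e, beta)
    return dot(d, d) / (1.0 - dot(beta, e)) ** 5

def synchrotron_angular_factor(vdot, e, beta):
    """(vdot^2 - (vdot.e)^2 / (gamma^2 (1 - beta.e)^2)) / (1 - beta.e)^3, valid for vdot.beta = 0."""
    w = 1.0 - dot(beta, e)
    inv_gamma_sq = 1.0 - dot(beta, beta)
    return (dot(vdot, vdot) - dot(vdot, e) ** 2 * inv_gamma_sq / w**2) / w**3

if __name__ == "__main__":
    beta = [0.5, 0.0, 0.0]
    vdot = [0.0, 2.0, 0.0]                      # perpendicular to beta
    norm = math.sqrt(14.0)
    e = [1.0 / norm, 2.0 / norm, 3.0 / norm]    # unit vector towards the observer
    print(double_cross(vdot, e, beta), expanded_form(vdot, e, beta))
    print(general_angular_factor(vdot, e, beta), synchrotron_angular_factor(vdot, e, beta))
```

The first line confirms the expansion of the double cross product, and the second that the general expression reduces to the synchrotron form when $\dot{\mathbf v}\cdot\boldsymbol\beta = 0$.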


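The particularly intense radiation in the tangential direction $\mathbf v$ (i.e., for $\mathbf e\perp\dot{\mathbf v}$) has field strength (see Fig. 3.35)

$$\mathbf E\approx-\frac{\mu_0q}{4\pi}\,\frac{\dot{\mathbf v}}{(1-\boldsymbol\beta\cdot\mathbf e)^2\,R}\,.$$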


Here, the electric field thus oscillates in the plane of the trajectory. 


3.5.7 Summary: Radiation Fields 


In this section we have investigated the coupling of the electromagnetic field with 
its generating sources, and to this end we have appropriately extended the solutions 
known from the static cases. Here, retardation becomes important. The result has 
been that the field due to an accelerated charge decreases more weakly by one power 
of the distance than for a uniformly moving (or resting) charge. At large distances, 
only the acceleration field is important for the radiation field. Its properties have been 
considered in the last section for various special cases. 




Problems 


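Problem 3.1 Reformulate $\nabla(\mathbf a\cdot\mathbf b)$ and $\nabla\times(\mathbf a\times\mathbf b)$ such that the operator $\nabla$ has only one vector to the right of it (on which it acts). Here the intermediate steps should be taken without components and the differential operator should treat both $\mathbf a_c$ and $\mathbf b_c$ as constant, so that the product rule reads $\nabla(\mathbf a\cdot\mathbf b) = \nabla(\mathbf a\cdot\mathbf b_c)+\nabla(\mathbf a_c\cdot\mathbf b)$, or again $\nabla\times(\mathbf a\times\mathbf b) = \nabla\times(\mathbf a\times\mathbf b_c)+\cdots$. The equations $\mathbf a\times(\mathbf b\times\mathbf c) = \mathbf b\,(\mathbf c\cdot\mathbf a)-\mathbf c\,(\mathbf a\cdot\mathbf b) = (\mathbf c\cdot\mathbf a)\,\mathbf b-(\mathbf a\cdot\mathbf b)\,\mathbf c$ need not be proven. (4 P)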


Problem 3.2 Using Cartesian components, determine V - r, V x r, and (a- V)r. 
These results will be useful for the following problems. (3 P) 


Problem 3.3 Consider an arbitrary (three-times differentiable) scalar function $\psi(\mathbf r)$ and the three vector fields $\nabla\psi$, $\mathbf r\times\nabla\psi$, and $\nabla\times(\mathbf r\times\nabla\psi)$. Which of them are source-free and which curl-free? Determine the source and curl strengths as functions of $\psi$. What is their inversion behavior (parity) if $\psi(-\mathbf r) = \psi(\mathbf r)$? (9 P)


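Problem 3.4 Prove $\oint_{(V)}(\mathrm d\mathbf f\cdot\mathbf a)\,\mathbf b = \int_V\mathrm dV\,\{\mathbf b\,(\nabla\cdot\mathbf a)+(\mathbf a\cdot\nabla)\,\mathbf b\}$ for arbitrary fields $\mathbf a(\mathbf r)$ and $\mathbf b(\mathbf r)$ and show that the volume integral of a source-free vector field $\mathbf a$ is always zero, if $\mathbf a$ vanishes on the surface $(V)$. (4 P)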


Problem 3.5 For which function w(r) does the (spatial) central field a(r) = w(r)r 
have sources only at the origin? Does it have curls? Investigate this also for a 
plane central field. Represent the solutions as gradient fields (gradients of scalar 
fields). (3 P) 


Problem 3.6 Let $(\Delta+k^2)\,\psi(\mathbf r) = 0$. How can we prove that the three vector fields from Problem 3.3 satisfy the equation $(\Delta+k^2)\,\mathbf a(\mathbf r) = 0$? Note the sources and curls of the vector fields. (4 P)


Problem 3.7 Determine the vector fields $\nabla(\mathbf p\cdot\mathbf r/r^3)$ and $\nabla\times(\mathbf r\times\mathbf p/r^3)$ for constant $\mathbf p$ (dipole moment) when $r\neq 0$, and compare them. (5 P)


Problem 3.8 Derive the singular behavior of the two vector fields for r = 0 from 
the volume integral of a sphere around the origin. Express the results in terms of the 
delta function. (8 P) 


Problem 3.9 Prove the representation of the Fourier transform of f (x) = g(x) h(x) 
as a convolution integral given on p. 22. (4 P) 


Problem 3.10 For fixed $\alpha$, $\beta$, $\gamma$ (with $\alpha>0$, $\beta>0$, and $0<\gamma<\pi$), a rectilinear oblique coordinate system $x^1$, $x^2$ is given by the two equations $x^1 = \alpha\,(x-y\cot\gamma)$ and $x^2 = \beta\,y$. Which functions $y(x)$ describe the coordinate lines $\{x^1, x^2\}$? At what angle do the coordinate lines cross? How do the basic vectors $\mathbf g_i$ and $\mathbf g^i$ read as linear combinations of the Cartesian unit vectors? How do the fundamental tensors $g_{ik}$ and $g^{ik}$ read? (7 P)




Problem 3.11 For spherical coordinates $(r, \theta, \varphi)$, we have to introduce position-dependent unit vectors $\mathbf e_r$, $\mathbf e_\theta$, and $\mathbf e_\varphi$ in the direction of increasing coordinates. Decompose these three vectors in terms of $\mathbf e_x$, $\mathbf e_y$, and $\mathbf e_z$. Determine their partial derivatives with respect to $r$, $\theta$, $\varphi$ and express them as multiples of the unit vectors $\mathbf e_r$, $\mathbf e_\theta$, $\mathbf e_\varphi$. (7 P)


Problem 3.12 Determine the covariant and contravariant base vectors $\{\mathbf g_i\}$ and $\{\mathbf g^i\}$ as multiples of the unit vectors $\mathbf e_i$ for spherical coordinates $x^1 = r$, $x^2 = \theta$, and $x^3 = \varphi$. (2 P)


Problem 3.13 With the help of the Maxwell construction, draw the force lines of two equally charged parallel lines with charge densities $q/l$ and separated by a distance $a$. This uses the theorem that, for a source-free field, there is the same flux through any cross-section of a force tube. What changes with this construction for oppositely charged parallel lines, i.e., charge densities $\pm q/l$, separated by a distance $a$? Why is the construction more precise than the method of drawing trajectories orthogonal to the equipotential lines? (8 P)


Problem 3.14 Determine the equation $f(x, z) = 0$ of the field line of an ideal dipole $\mathbf p = p\,\mathbf e_z$ which lies at the origin ($\mathbf r' = 0$). Note that, due to the cylindrical symmetry, we may set $y = 0$. (4 P)


Problem 3.15 On the z-axis there are several point charges $q_i$ at the positions $z_i$. Determine their common potential $\Phi$ by Taylor series expansion up to order $(z_i/r)^3$. Examine the result for the potential when $r\gg a$ (write $\Phi$ as a multiple of $q_1 = q$) for:


• a dipole ($q_1 = -q_2$, $z_1 = -z_2 = \tfrac12 a$),
• a linear quadrupole ($q_1 = -\tfrac12 q_2 = q_3$, $z_1 = -z_3 = a$, $z_2 = 0$), and
• an octupole ($q_1 = -\tfrac13 q_2 = +\tfrac13 q_3 = -q_4$, $z_1 = 3z_2 = -3z_3 = -z_4 = \tfrac32 a$).


Show that the field of a finite dipole may be written approximately as a superposition of a dipole field and an octupole field. How strong is the octupole field compared with the field of a pure quadrupole? Justify with the examples above that an ideal $2^n$-pole can be viewed as a superposition of two $2^{n-1}$-poles. (8 P)


Problem 3.16 Determine the potential and field strength of a hollow sphere with outer and inner radii $R$ and $\eta R$ and a charge $Q$ distributed evenly over its volume. Here $0\le\eta\le 1$, so a solid sphere has $\eta = 0$ and a surface charge $\eta = 1$. Sketch the results $\Phi(r)$ and $E(r)$ in the limiting cases $\eta = 0$ and $\eta = 1$. In these limiting cases, how much field energy is in the space with $r<R$? How much is in the external space? (7 P)


Problem 3.17 Express the potential of a metal ring of radius $R$ and charge $Q$ in terms of the complete elliptic integral of the first kind $K(m)$ (with $0\le m\le 1$) of p. 202. Here it will be convenient to replace the spherical coordinate $\varphi$ by $\pi-2x$. Determine the potential and the field strength on the axis of the ring. (6 P)




Problem 3.18 Determine the potential and field strength on the axis of a thin metal 
disc of radius R and charge Q for constant charge density. What is the jump in the 
field strength at the disc? (3 P) 


Problem 3.19 What is obtained for the potential on the axis if the disc has a constant dipole density $p_A$? What is the jump in the potential? (4 P)


Problem 3.20 On a straight line at distance $a$ from the origin, let there be a point charge $q>0$, and at distance $a'$ on the same side of the origin, a charge $-q'<0$. For suitable $q'(q, a, a')$, the potential vanishes on the surface of a sphere about the origin. What is its radius? Use this to determine the charge density $\rho_A$ on a grounded metal sphere of radius $R$ induced by a point charge $q$ at distance $a$ from the center of the sphere. What changes for an ungrounded metal sphere? (6 P)


Problem 3.21 How does the Maxwell stress tensor read for a homogeneous field of strength $\mathbf E = E\,\mathbf e_z$ in vacuum? How strong is the force on a volume element $\mathrm dx\,\mathrm dy\,\mathrm dz$? Using the stress tensor, determine the force on an area $A$ if its normal is $\mathbf n = \mathbf e_x\sin\theta+\mathbf e_z\cos\theta$.


Hint: Decompose the force into components along $\mathbf n$, $\mathbf t = \mathbf e_x\cos\theta-\mathbf e_z\sin\theta$, and $\mathbf b = \mathbf t\times\mathbf n$. Draw the vectors $\mathbf E$, $\mathbf n$, $\mathbf t$, and $\mathbf F$ for $\theta = 30°$. Interpret the result for opposite sides of a cube. (7 P)


Problem 3.22 How does the stress tensor change at the x, y-plane if it carries the charge density $\rho_A$ and is placed in an external (homogeneous) field in the z direction? Can the force on an enclosing layer be related to the mean value of the field strength above and below the plane? Determine the Cartesian components of the Maxwell stress tensor on the plane midway between two equal charges $q$ (each at distance $a$ from this plane). What force is thus exerted from one of the sides on the plane?


Hint: Express the strength of the field in cylindrical coordinates. (7 P) 


Problem 3.23 Determine the electric field around a metal sphere in a homogeneous electric field. Superpose the field of a dipole $\mathbf p$ on a suitable homogeneous field $\mathbf E_0$ in such a way that the tangential component of the total field vanishes on the surface of the sphere of radius $r$ around the dipole. How large is the normal component (in particular in the direction of $\mathbf E_0$, opposite and perpendicular to it)? (4 P)


Problem 3.24 Determine the current density and resistance for half a metal ring with circular cross-section (area $\pi a^2$), whose axis forms a semi-circle of radius $A$ (conductivity $\sigma$), if there is a voltage $U$ between the faces.¹ Note the special case $a\ll A$. (5 P)


¹Using the substitution $t = \tan\tfrac12 x$ with $t' = \tfrac12\,(1+t^2)$ and $\cos x = (1-t^2)/(1+t^2)$, the integral of $(1+k\cos x)^{-1}$ for $|k|<1$ can be transformed into the integral of $2/(1-k)\,(K^2+t^2)^{-1}$ with $K^2 = (1+k)/(1-k)$. This yields $2\,(1-k^2)^{-1/2}\arctan(t/K)$.




Problem 3.25 In an otherwise homogeneous conductor, there is a spherical void of radius $r_0$ containing air. Determine the current density $\mathbf j$ if it is equal to $\mathbf j_0$ for large $r$. (3 P)


Problem 3.26 Equal currents $I$ flow through two equal coaxial circles (radii $R$) a
distance a apart. For which ratio a/R is the magnetic field strength at the center 
of the setup as homogeneous as possible? What does that mean? Where would we 
have to place a further pair of loop currents with radius iR in order to amplify 
the homogeneous field? Can the homogeneity be improved by a suitable choice of 
current strengths in two pairs of loops? (8 P) 


Problem 3.27 A closed iron ring with permeability $\mu$ and dimensions $a$ and $A$, as in Problem 3.24, is wrapped around $N$ times with a thin wire. How large is the induction flux $\Phi = \int\mathrm d\mathbf f\cdot\mathbf B$ in the ring? How large is the relative error $\delta = |\overline\Phi-\Phi|/\Phi$, if we assume a constant magnetic field $H$ equal to its value at the center of the cross-section, which gives the flux $\overline\Phi$? Determine $\Phi$ and $\delta$ for $N = 600$, $\mu = 500\,\mu_0$, $A = 20$ cm, $\pi a^2 = 10$ cm², and $I = 1$ A. The iron ring may have a narrow discontinuity (air gap) of width $d$. It can be so narrow that no field lines escape from the slit. How does the induction flux depend on the width $d$ if we use a constant magnetic field $H$ in the cross-section? (7 P)


Problem 3.28 The mutual inductance of two coaxial circular rings of radii $R$ and $R'$ a distance $a$ apart is determined as $L = \mu_0\sqrt{RR'}\;\{2\,(K-E)-k^2K\}/k$, with the parameter $k^2 = 4RR'/\{a^2+(R+R')^2\}$, involving the complete elliptic integrals of the first kind, viz., $K(k^2)$ as in Problem 3.17, and the second kind, viz., $E(k^2) = \int_0^{\pi/2}\sqrt{1-k^2\sin^2 z}\;\mathrm dz$. What is obtained to leading order for $L$ at very large distances ($R\ll a$, $R'\ll a$)?


Hint: Expand $K$ and $E$ in powers of $k$. (3 P)


Problem 3.29 In the limit of small distances ($R \approx R' \gg a$), we use the Landen
transformation $F(x|k^2) = \{2/(1+k)\}\,F(x'|k'^2)$ for the incomplete elliptic integral of
the first kind $F(x|k^2) = \int_0^x \mathrm{d}z/\sqrt{1 - k^2\sin^2 z}$, with $x' = \frac{1}{2}\{x + \arcsin(k\sin x)\}$ and
$k'^2 = 4k/(1+k)^2$.

With $\sin(2z_1 - z) = k\sin z$, we have $\cos(2z_1 - z)\,(2\,\mathrm{d}z_1 - \mathrm{d}z) = k\cos z\,\mathrm{d}z$,
hence also $\mathrm{d}z\,\{k\cos z + \cos(2z_1 - z)\} = 2\,\mathrm{d}z_1\cos(2z_1 - z) = 2\,\mathrm{d}z_1\,(1 - k^2\sin^2 z)^{1/2}$.
The square of the curly brackets is equal to $k^2\cos^2 z + 2k\cos z\cos(2z_1 - z) + 1 - k^2\sin^2 z$,
or again $1 + k^2 + 2k\,\{\cos z\cos(2z_1 - z) - \sin z\sin(2z_1 - z)\}$.
The curly bracket may be reformulated as $\cos 2z_1 = 1 - 2\sin^2 z_1$. Then

$$\frac{\mathrm{d}z}{(1 - k^2\sin^2 z)^{1/2}} = \frac{2\,\mathrm{d}z_1}{\{(1+k)^2 - 4k\sin^2 z_1\}^{1/2}}\,,$$

which is important for the proof of Landen’s transformation.

Prove that $K(1 - \varepsilon) \approx \ln(4/\sqrt{\varepsilon})$. What follows for the inductance $L(R, R', a)$? (5 P)
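
As a numerical sanity check of the limit to be proved (not a substitute for the proof), the following sketch evaluates $K$ through the arithmetic-geometric mean, using the standard identity $K(k^2) = \pi/\{2\,\mathrm{AGM}(1, \sqrt{1-k^2})\}$, which is an assumption brought in from outside this text. All function names are illustrative.

```python
import math

def agm(a, b, iterations=40):
    """Arithmetic-geometric mean of two positive numbers (converges quadratically)."""
    for _ in range(iterations):
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return a

def K_one_minus(eps):
    """K(k^2) for k^2 = 1 - eps, using the complementary modulus sqrt(eps) directly."""
    return math.pi / (2.0 * agm(1.0, math.sqrt(eps)))

def log_approximation(eps):
    """Leading-order expression ln(4 / sqrt(eps)) for small eps."""
    return math.log(4.0 / math.sqrt(eps))

# Compare the exact value with the logarithmic approximation for shrinking eps.
for eps in (1e-2, 1e-4, 1e-6):
    print(eps, K_one_minus(eps), log_approximation(eps))
```

The difference between the two columns should shrink as `eps` decreases, which is what the asymptotic statement claims.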




Fig. 3.36 Between two points of a circuit, a voltmeter is connected with thin (loss-free) wires
(resistance $R_0$), such that the area $A$ spanned by the circuit is divided in the ratio $A_1 : A_2$


Problem 3.30 Derive from this the self-inductance of a thin ring of wire with circular
cross-section (abbreviation as in Problem 3.24), which is composed of the
mutual inductances $L = (\pi a^2)^{-2}\int \mathrm{d}f_1\,\mathrm{d}f_2\,L_{12}$ of its filaments. Here
$\int_0^{\pi}\ln(A + B\cos\varphi)\,\mathrm{d}\varphi = \pi\ln\{\frac{1}{2}(A + \sqrt{A^2 - B^2})\}$ for $A > |B|$. (For ferromagnetic materials,
there is an additional term, not required here.) (5 P)


Problem 3.31 For a current strength $I$, determine the vector potential of a circular
ring of radius $R_0$ at an arbitrary point $\mathbf{r}$. The circular ring suggests using cylindrical
coordinates $(R, \varphi, z)$ with $\mathbf{r} = R\,\mathbf{e}_R + z\,\mathbf{e}_z$. (6 P)


Problem 3.32 A very long hollow cylinder with inner radius $R_{\mathrm{i}}$, outer radius $R_{\mathrm{a}}$,
and permeability $\mu$ is brought into a homogeneous magnetic field $\mathbf{H}_0$ perpendicular
to its axis. Determine $\mathbf{B}$ and $\mathbf{H}$ for all $\mathbf{r}$. How large is the field $H_0$ compared to its
value on the axis for $\mu \gg \mu_0$? (9 P)


Problem 3.33 Perpendicular to the circuit shown in Fig. 3.36, made of a thin wire
with resistance $R = R_1 + R_2$, a homogeneous magnetic field changes by equal
amounts in equal time intervals. What voltage does the voltmeter show, and in particular,
if the circuit forms a circle and the voltmeter sits at the center of the circle
and is connected with straight wires? (5 P)


Problem 3.34 An insulating cuboid ($0 \le x \le L_x$, $0 \le y \le L_y$, $0 \le z \le L_z$) of
homogeneous material with scalar permittivity and permeability is enclosed by ideally
conducting walls. Investigate the following ansatz for the vector potential:

$$A_x = a_x \cos(\omega t)\,\cos(k_x x + \varphi_{xx})\,\cos(k_y y + \varphi_{xy})\,\cos(k_z z + \varphi_{xz})\,,$$
$$A_y = a_y \cos(\omega t)\,\cos(k_x x + \varphi_{yx})\,\cos(k_y y + \varphi_{yy})\,\cos(k_z z + \varphi_{yz})\,,$$
$$A_z = a_z \cos(\omega t)\,\cos(k_x x + \varphi_{zx})\,\cos(k_y y + \varphi_{zy})\,\cos(k_z z + \varphi_{zz})\,,$$

with the radiation gauge. Can we restrict ourselves here to $0 \le \varphi_{ik} < \pi$? What is the
relation between $\omega$ and $\mathbf{k}$ if all the Maxwell equations are valid? What requirements
follow from the boundary conditions $\mathbf{n}\times\mathbf{E} = 0$ and $\mathbf{n}\cdot\mathbf{B} = 0$? (7 P)


Problem 3.35 What requirement does the gauge condition $\nabla\cdot\mathbf{A} = 0$ lead to for the
ansatz above? What do we then obtain for the three fields $\mathbf{A}$, $\mathbf{E}$, and $\mathbf{B}$? (5 P)




Problem 3.36 What do we obtain if k is parallel to one of the edges of the cuboid? 
What is the general ansatz for A in Problem 3.34? (3 P) 


Problem 3.37 Express the energy density $w(t, \mathbf{r})$ of an electromagnetic wave in
terms of its vector potential in the radiation gauge, i.e., with $\Phi = 0$ and $\nabla\cdot\mathbf{A} = 0$.
How can Parseval’s equation help to re-express the total energy of the wave (integrated
over the whole space) as an integral of the squares of the absolute values of
$\mathbf{A}(t, \mathbf{k})$ and $\partial\mathbf{A}/\partial t$ with weight factors? What is the unknown expression? (5 P)


Problem 3.38 How does the electric field amplitude of the reflected and transmitted
waves depend on the incoming amplitude in the limiting cases $\theta = 0°$ and
$90°$ (expressed in terms of the refractive index $n$)? To what extent are the parallel
and perpendicular components to be distinguished for perpendicular incidence
($\theta = 0°$)? (4 P)


Problem 3.39 How large is the energy flux W, averaged over time, for an electro- 
magnetic wave with wave vector k passing through an area A perpendicular to k? 
What do we obtain for the reflected and the transmitted waves in the limiting cases 
investigated above? (4 P) 


Problem 3.40 Does the energy conservation law hold true for an electromagnetic
wave, incident with the wave vector $\mathbf{k}$ on the interface between two homogeneous
insulators (with an arbitrary angle of incidence)? Investigate this question for arbitrary
scalar material constants $\varepsilon$ and $\mu$, i.e., also with $\mu \ne \mu_0$. (5 P)


Problem 3.41 For a homogeneous conductor (with scalar $\sigma$, $\varepsilon$, and $\mu$), derive the
relation between $\omega$, $\mathbf{E}^*\cdot\mathbf{D}$, and $\mathbf{H}^*\cdot\mathbf{B}$ from the Maxwell equations, if only one
wave vector is given. How is the time average of the Poynting vector connected to
the averaged energy density $w$?

Hint: Use the approximation $\alpha^2 \approx \beta^2 \approx \sigma/(2\varepsilon\omega) \gg 1$. (7 P)


List of Symbols 


We stick closely to the recommendations of the International Union of Pure and 
Applied Physics (IUPAP) and the Deutsches Institut fiir Normung (DIN). These 
are listed in Symbole, Einheiten und Nomenklatur in der Physik (Physik-Verlag, 
Weinheim 1980) and are marked here with an asterisk. However, one and the same 
symbol may represent different quantities in different branches of physics. Therefore, 
we have to divide the list of symbols into different parts (Table 3.2). 


Table 3.2 Symbols used in electromagnetism

Symbol                  Name                                                    Page number
* $Q$                   Charge                                                  165
  $q$                   Point charge                                            165
* $\rho$                (Space) Charge density                                  166
  $\rho_A$ (a)          Surface charge density                                  167
* $I$                   Current strength                                        186
* $\mathbf{j}$          Current density                                         186
  $\mathbf{j}_A$        Current density in a surface                            195
* $\mathbf{E}$          Electric field strength                                 166
* $\mathbf{D}$          Electric flux density (displacement field)              174
* $\mathbf{B}$          Magnetic flux density (magnetic induction field)        181
* $\mathbf{H}$          Magnetic field strength                                 193
* $\varepsilon$         Permittivity (dielectric constant)                      176
* $\varepsilon_0$       Electric field constant (vacuum permittivity)           164, 623
* $\mu$                 Permeability                                            196
* $\mu_0$               Magnetic field constant (vacuum permeability)           164, 623
* $c$ ($c_0$)           Light velocity (in vacuum)                              164, 216, 623
* $\chi_e$              Electric susceptibility                                 175
* $\chi_m$              Magnetic susceptibility                                 196
* $\mathbf{P}$          Electric polarization                                   174
* $\mathbf{M}$          Magnetization                                           191
* $\mathbf{p}$          Electric dipole moment                                  171
* $\mathbf{m}$          Magnetic dipole moment                                  190
* $U$                   (Electric) Voltage                                      169
  $\Phi$ (b)            Electric potential                                      56
* $\mathbf{A}$          Vector potential                                        197
  $E_{\mathrm{pot}}$ (c) Potential energy                                       169
* $W$ (d)               Work                                                    181
* $w$                   Energy density                                          211
  $\mathbf{N}$ (e)      Torque                                                  171
* $C$                   Capacitance                                             179
* $R$                   Electric resistance                                     187
* $\sigma$              Electric conductivity                                   187
* $L$ (f)               Inductance                                              201
  $\mathcal{Z}$ (g)     Impedance                                               213
  $\mathbf{S}$          Poynting vector                                         211
  $T$                   Stress tensor                                           184, 215
  $F_{\mu\nu}$          Electromagnetic field tensor                            240

(a) The abbreviation $\sigma$ is actually recommended for this, but it is also used for the conductivity.
    The index $A$ reminds us of an area. We also use it for the area divergence and area rotation.
(b) $\varphi$ is actually recommended, but we use it for the azimuth.
(c) $V$ is needed for the volume.
(d) The abbreviation $A$, common in mechanics, is needed here for the area.
(e) $M$ is recommended, but used here for the magnetization.
(f) $L$ is recommended for the self-inductance. We also use this abbreviation for the mutual inductance.
(g) $Z$ should be taken for the impedance, but $\mathcal{Z}$ stresses the fact that it is a complex quantity
    ($\mathcal{Z} = R + \mathrm{i}X$, with resistance $R$ and reactance $X$).


References 


1. J.H. Hannay, Eur. J. Phys. 4, 141 (1983) 
2. I. Brevik, Phys. Rep. 52, 133 (1979) 
3. E.W. Schmid, G. Spitz, W. Lösch, Theoretical physics with the PC (Springer, Berlin, 1987) 


Suggestions for Textbooks and Further Reading 


4. H. Goldstein, C.P. Poole, J.L. Safko, Classical Mechanics, 3rd edn. (Pearson, 2014)
5. W. Greiner, Classical Electrodynamics (Springer, New York, 1998) 
6. J.D. Jackson, Classical Electrodynamics, 3rd edn. (Wiley, New York, 1998)
7. L.D. Landau, E.M. Lifshitz, Course of Theoretical Physics, Vol. 2: The Classical Theory of
Fields, 4th edn. (Butterworth-Heinemann, Oxford, 1975)
8. L.D. Landau, E.M. Lifshitz, Course of Theoretical Physics, Vol. 8: Electrodynamics of
Continuous Media, 2nd edn. (Butterworth-Heinemann, Oxford, 1984)
9. P. Lorrain, D. Corson, F. Lorrain, Electromagnetic Fields and Waves, 3rd edn. (W.H. Freeman, 
New York, 1988) 
10. W. Nolting, Theoretical Physics 3-Electrodynamics (Springer, Berlin, 2016) 
11. W. Nolting, Theoretical Physics 4-Special Theory of Relativity (Springer, Berlin, 2017) 
12. W.K.H. Panofsky, M. Phillips, Classical Electricity and Magnetism, 2nd edn. (Addison-Wesley, 
Reading, 1962) 
13. W. Rindler, Essential Relativity-Special, General, and Cosmological, revised, 2nd edn. 
(Springer, New York, 1977) 
14. F. Scheck, Classical Field Theory (Springer, Berlin, 2018) 
15. A. Sommerfeld, Lectures on Theoretical Physics 3-Electrodynamics (Academic, London, 1964) 
16. A. Sommerfeld, Lectures on Theoretical Physics 4-Optics (Academic, London, 1964) 
17. W. Thirring, Classical Mathematical Physics: Dynamical Systems and Field Theories, 3rd edn. 
(Springer, New York, 2013) 
18. A. Zangwill, Modern Electromagnetics (Cambridge University Press, 2013) 


Chapter 4
Quantum Mechanics I


4.1 Wave–Particle Duality


4.1.1 Heisenberg’s Uncertainty Relations 


A natural law is required to be true without exception: for all observers under equal
conditions the same result should be obtained. However, “equal conditions” have to
be reproducible and “identical results” can only be ensured within certain error limits.
With $N$ measurements, the experimental values $x_i$ in the statistical ensemble scatter
around the average value $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$ with an average error (for the individual
measurement)

$$\Delta x = \sqrt{\overline{(x - \bar{x})^2}} = \sqrt{\overline{x^2} - \bar{x}^2}\,.$$

We assume $N \gg 1$ and hence may leave out the factor $\sqrt{N/(N-1)}$ from p. 46.
Here, $\overline{x^2}$ is the average value of the squares of the experimental values. These notions
have been explained in detail in Sect. 1.3.
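
The error width defined above can be evaluated directly from a list of readings. The following is a minimal sketch; the function names and the sample readings are made up for illustration, and the optional factor $\sqrt{N/(N-1)}$ is the one the text says may be dropped for $N \gg 1$.

```python
import math

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def error_width(values, corrected=False):
    """Delta x = sqrt(mean(x^2) - mean(x)^2), optionally times sqrt(N/(N-1))."""
    n = len(values)
    x_bar = mean(values)
    x2_bar = mean([x * x for x in values])
    # max(..., 0.0) guards against tiny negative values caused by rounding
    width = math.sqrt(max(x2_bar - x_bar ** 2, 0.0))
    if corrected:
        width *= math.sqrt(n / (n - 1))
    return width

readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]  # hypothetical measurements
print(mean(readings), error_width(readings), error_width(readings, corrected=True))
```

For long series of readings the two variants agree ever more closely, since the correction factor tends to 1.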

A basic feature of quantum physics is that canonically conjugate quantities cannot
simultaneously have arbitrarily small error widths: the smaller the one, the larger
the other. For example, the momentum $p_k = \partial L/\partial\dot{x}^k$ is canonically conjugate to the
coordinate $x^k$ (see p. 99). Since Niels Bohr, such pairs of quantities have been referred
to as complementary.

In classical physics, this situation does not have the same relevance, even though
there are complementary quantities, e.g., for the position $x$ and wavenumber $k =
2\pi/\lambda$ of a wave group, we have $\Delta x\cdot\Delta k \ge 1/2$. The inequality holds in particular for
all pairs of quantities connected by Fourier transform. For Gaussian distributions, we
find $\Delta x\cdot\Delta k = 1/2$ (Problem 4.1), and these have the smallest uncertainty product
possible for complementary quantities, as will be shown later (p. 321).

© Springer Nature Switzerland AG 2018
A. Lindner and D. Strauch, A Complete Course on Theoretical Physics, Undergraduate Lecture Notes in Physics,
https://doi.org/10.1007/978-3-030-04360-5_4

However, in classical physics it is often overlooked that canonically conjugate 
quantities are always complementary to each other, because there the basic error 
limits may be neglected in comparison to the average values. The situation is different 
in quantum theory: here the uncertainty relations are indispensable. Hence it must 
be a statistical theory, as only then do error widths make sense. 

According to Heisenberg, for canonically conjugate quantities like position and
momentum, we have quantitatively

$$\Delta x^k\cdot\Delta p_{k'} \ge \tfrac{1}{2}\hbar\,\delta^k_{k'}\,,$$

with $\hbar \equiv h/2\pi$. We use $h$ to denote Planck’s action quantum, but nowadays in
quantum theory it is more common to use $\hbar$. This does not occur in the classical
relation $\Delta x\cdot\Delta k \ge 1/2$. According to de Broglie, $\mathbf{p} = \hbar\mathbf{k}$ (more on that on p. 319),
so the two uncertainty relations are connected to each other. Note, however, that the
uncertainty is sometimes defined differently, and then there is a different numerical
factor in the uncertainty relations. Note also that Heisenberg [1] calls uncertain
quantities “undetermined”, but that can be misunderstood.
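
The classical relation $\Delta x\cdot\Delta k \ge 1/2$ can be checked numerically for simple wave groups. The sketch below is restricted to real, symmetric amplitudes $\psi(x)$ (so that the mean values of $x$ and $k$ vanish) and uses Parseval's relation in the form $\overline{k^2} = \int|\psi'|^2\,\mathrm{d}x / \int|\psi|^2\,\mathrm{d}x$ to avoid an explicit Fourier transform. The function name, grid parameters, and the two test shapes are illustrative assumptions.

```python
import math

def uncertainty_widths(psi, x_max=20.0, n=4001):
    """Return (Delta x, Delta k) for a real, symmetric amplitude psi(x)."""
    h = 2 * x_max / (n - 1)
    xs = [-x_max + i * h for i in range(n)]
    f = [psi(x) for x in xs]
    norm = sum(v * v for v in f) * h
    mean_x2 = sum(x * x * v * v for x, v in zip(xs, f)) * h / norm
    # <k^2> from the derivative (central differences) via Parseval's relation
    dpsi = [(f[i + 1] - f[i - 1]) / (2 * h) for i in range(1, n - 1)]
    mean_k2 = sum(d * d for d in dpsi) * h / norm
    return math.sqrt(mean_x2), math.sqrt(mean_k2)

dx_gauss, dk_gauss = uncertainty_widths(lambda x: math.exp(-x * x / 4))
dx_sech, dk_sech = uncertainty_widths(lambda x: 1 / math.cosh(x))
print("Gaussian:", dx_gauss * dk_gauss)
print("sech:    ", dx_sech * dk_sech)
```

The Gaussian should reproduce the minimal product, while the second shape gives a slightly larger one. Multiplying by $\hbar$ turns both into momentum uncertainty products via $p = \hbar k$.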

Thus we cannot, for example, produce an ensemble which is sharp (certain) in the 
position as well as in the momentum. If we force a ray with sharp direction through 
a narrow slit, in order to minimize the position uncertainty (perpendicular to the ray 
direction), then it spreads out because of the diffraction—and this all the more, the 
narrower the slit. The momentum orthogonal to the old direction of the ray can no 
longer be neglected and is unsharp (uncertain). (Its average value does not need to 
change, only the uncertainty.) By eliminating inappropriate parts of the position, we 
have changed the original ray. 

The uncertainty relations are thus already satisfied in the production of a sta- 
tistical ensemble. The uncertainties can often be attributed to the (then following) 
measurement, but after the measurement the observation is already finished. 

We start from the uncertainty relations as observational facts. As Heisenberg 
shows for many examples in the above-mentioned book, quantum phenomena only 
contradict our everyday experience if the uncertainty relations are not considered. 


4.1.2 Wave–Particle Dualism


In order to solve Hamilton’s equations, unique initial values of position and momen- 
tum are necessary, but this requirement can only be satisfied within error limits, 
because of the uncertainty relations. Hence in quantum theory, we cannot apply the 
usual notion of determinism to processes; we can only predict how all possible
states will develop. Classically, we could assign probabilities to the possible orbits,
and given what was said above, we should only actually try to find such probabilistic 
statements. 

However, with the probability distributions, interference now occurs, which is the 
classical proof that waves are involved. On the other hand, other experimental results 
involve shot noise (granularity) and hence support the idea that it is particles, and 
not waves, that are involved. 

This contradiction shows up clearly in the scattering of monochromatic electrons
of sufficient energy from a crystal lattice. Here interference figures result on the
detector screen. This fact is taken as the classical proof of a wave-like nature. With
decreasing radiation intensity, the strength of the detection on the screen is reduced,
but not continuously: detections appear now here, now there, like shot noise. This
is the classical proof of a particle-like nature.

If electrons were classical particles, then they would hit the screen like grains of
shot, and the intensity distribution $\rho(\mathbf{r})$ would result, without interference, as a
sum of the intensity distributions $\rho_n(\mathbf{r})$ of the single scattering centers:

$$\rho = \sum_n \rho_n \qquad \text{(particle picture)}\,.$$

The function $\rho(\mathbf{r})$ describes the probability density of the strikes. If we exhaust all
possibilities, then we should obtain 1, i.e., $\int \mathrm{d}\mathbf{r}\,\rho(\mathbf{r}) = 1$. Of course, for discrete
possibilities there is a sum instead of an integral.

If electrons were classical waves, then the intensity should decrease continuously,
and we should not observe any granularity of the radiation. The intensity distribution
$\rho(\mathbf{r})$ would not simply be the sum of the intensity distributions $\rho_n(\mathbf{r})$ of the scattering
centers, but would show interference: we would have to work with complex
amplitudes $\psi_n(\mathbf{r})$, superpose them, and form the square of the absolute value of the
sum:

$$\rho = \Big|\sum_n \psi_n\Big|^2 = \sum_n |\psi_n|^2 + \sum_{n<m}\big(\psi_n^*\,\psi_m + \psi_m^*\,\psi_n\big) \qquad \text{(wave picture)}\,.$$

The mixed terms ($2\,\mathrm{Re}\sum_{n<m}\psi_n^*\,\psi_m$) describe the interference (see Fig. 4.1).
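
The difference between the two pictures can be made concrete with a small numerical sketch. It assumes an idealized geometry that is not taken from the text: point-like slits at positions `slits`, a screen at distance `distance`, and amplitudes of unit modulus $\psi_n = \exp(\mathrm{i}kr_n)$. All names and parameter values are illustrative.

```python
import cmath
import math

def slit_amplitudes(x, slits, k=2 * math.pi, distance=50.0):
    """Complex amplitudes psi_n at screen position x, one per slit (unit modulus)."""
    return [cmath.exp(1j * k * math.hypot(distance, x - s)) for s in slits]

def particle_picture(amps):
    """rho = sum_n |psi_n|^2 (no interference)."""
    return sum(abs(a) ** 2 for a in amps)

def interference_terms(amps):
    """Mixed terms 2 Re sum_{n<m} psi_n^* psi_m."""
    return 2 * sum((amps[n].conjugate() * amps[m]).real
                   for n in range(len(amps))
                   for m in range(n + 1, len(amps)))

def wave_picture(amps):
    """rho = |sum_n psi_n|^2 (amplitudes are superposed before squaring)."""
    return abs(sum(amps)) ** 2

slits = [-1.0, 1.0]
for x in (0.0, 3.0, 6.0, 9.0):
    amps = slit_amplitudes(x, slits)
    print(x, particle_picture(amps), wave_picture(amps), interference_terms(amps))
```

In this sketch the particle picture gives the same value everywhere on the screen, whereas the wave picture oscillates around it by exactly the mixed terms.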


4.1.3 Probability Waves 


Classically, the two pictures (or models), “wave” or “particle”, are mutually exclu- 
sive. The quantum theory can remove this contradiction: it contains both pictures as 
limiting cases, but restricts their range of application via the uncertainty relations in 
such a way that they agree with each other, as we shall soon show. 

In the above example, in particular, the point of impact of an individual electron
cannot be predicted with certainty. Only the statistical ensemble exhibits equal results
under equal conditions, and in particular, always the same interference figure. Only
this impact probability, or more precisely probability density, actually obeys a
law, and the theory should only make statements about that. With the observed
interference, we need a wave theory.

Fig. 4.1 Intensity distribution behind a single slit (dotted line) and a double slit (continuous line)
according to the wave picture. All three slits have width $b$, and likewise the separation of the double
slit. The interference can be explained only with the wave picture; in particular the zeros cannot
be explained with the particle picture. For the single slit of width $2a$, we have $\Delta x = a/\sqrt{6}$ and $\Delta k$
is infinite! For this reason, only the part between the main minima is often considered. Then we
obtain $\Delta k = 1/a$ and hence $\Delta x\cdot\Delta k = 1/\sqrt{6}$

But it is important that a wave theory only applies for the impact probability, while 
the classical wave picture is based on “real” waves: it is extrinsic to a wave theory 
that the field quantity in the statistical ensemble fluctuates, whence deviations from 
the corresponding classical value may occur. But this statistical error is important 
for quantum theory: it allows the “granularity” of the radiation, which fits into the 
particle picture. This granularity remains unnoticed for large particle numbers [2,
p. 4]: “The all-leveling law of large numbers masks completely the true nature of the
single processes.” On the other hand, for large numbers, uncertainties in the particle
number barely show up.

In order to capture the granularity for small particle numbers, we have to quantize
the wave theory, that is, to take the intensities as natural multiples of a basic intensity.
We treat field quantization in Sect. 4.2.8, and in more detail in Sect. 5.3 on many-particle
problems. (Incidentally, the so-called second quantization is nothing else
than field quantization, a misleading name that can only be understood from the
historical perspective. There is in fact only one prescription for quantization, even
though it may look different for fields and particles, because it does indeed produce
the same result.)

Here we must also consider the fact that the relative phases are important for
the superposition of waves. Uncertainty in the phase suppresses the possibility of
interference and allows only an incoherent superposition. In fact, as will be shown in
Sect. 4.2.9, there is also an uncertainty relation relating particle number and phase,
although it cannot generally be written $\Delta N\cdot\Delta\Phi \ge 1/2$, because the phase makes
sense only up to a multiple of $2\pi$, and the particle number can only be positive-definite.

We shall speak of particles, as is common practice, and assign probability waves 
to them. Occasionally, we shall speak of quanta, which like particles are natural 
multiples of an element and can interfere like waves. 

Our way to describe the interference of probability waves has already been indi- 
cated in the last section: we introduce probability amplitudes! These amplitudes are 
usually called wave functions. Several of them can interfere with each other—the 
amplitudes are added, and the square of the absolute value of the sum yields the prob- 
ability (or the probability density). Here, the interference phenomenon is expressed 
in the mixed terms. 

These rules are valid, however, only if the individual parts can interfere with each 
other. (It is well known that, in addition to coherent light, there is also incoherent 
light, which cannot interfere.) If the phase relations are destroyed by an external 
manipulation, for example, using light to observe the different paths of the partial 
waves, then it is no longer the probability amplitudes that are added, but only the 
probabilities. An “incoherent” superposition arises. For the moment, we restrict our- 
selves to (interfering) coherent superpositions, which are particularly important for 
the development of the usual quantum theory. We shall only deal with the general 
case at the end of the next section (Sect. 4.2.11). This extends the applicability of
quantum theory and is important, e.g., in thermodynamics. 
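
How the loss of a fixed phase relation turns a coherent into an incoherent superposition can be illustrated with two path amplitudes. The sketch below models the destroyed phase relation, as an assumption of this example, by averaging over equally spaced relative phases; the amplitudes `a` and `b` are arbitrary illustrative values.

```python
import cmath
import math

def coherent_probability(a, b, phase=0.0):
    """|a + exp(i*phase) * b|^2 for two path amplitudes a and b."""
    return abs(a + cmath.exp(1j * phase) * b) ** 2

def phase_averaged_probability(a, b, steps=360):
    """Average of the coherent result over equally spaced relative phases in [0, 2*pi)."""
    phases = [2 * math.pi * j / steps for j in range(steps)]
    return sum(coherent_probability(a, b, p) for p in phases) / steps

a, b = 0.6, 0.8  # hypothetical path amplitudes
print("coherent (fixed phase 0):", coherent_probability(a, b))
print("phase-averaged:          ", phase_averaged_probability(a, b))
print("sum of probabilities:    ", abs(a) ** 2 + abs(b) ** 2)
```

With a fixed relative phase the amplitudes add, while the phase average reproduces the plain sum of the two probabilities, since the mixed terms cancel.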

Only the probabilities (or probability densities) can be measured (observed), not
the associated amplitudes. Hence only the absolute values and relative phases of the
amplitudes are fixed. In principle, a general phase factor $\exp(\mathrm{i}\phi)$ remains arbitrary.


4.1.4 Pure States and Their Superposition (Superposition 
Principle) 


By probability we understand the ratio of actual events to the total number of the 
considered events taken all together. An ensemble with specific attributes (signatures) 
is investigated for its properties. A statistical ensemble with attributes is called a
state in quantum theory. As already indicated, for the moment we shall not consider
incoherent mixtures, but only the so-called pure states:


Objects in the same state have the same properties and cannot be distinguished from each 
other; if they could be distinguished, then the state would not be characterized with sufficient 
precision. 


The objects here are intended to be representative of a class of objects. The notion 
of state serves in a statistical theory. To this end we need a filter which decides 
whether or not the property exists. Only experience can teach us which attributes 
are necessary for the complete characterization of a state. For a long time it was 
believed, for example, that there was only one species of neutrino, whereas three are 
now distinguished. On the other hand, the subdivision may also be too fine: for several 
years, the various decay channels of the kaon were assigned to different particles. 

As an example, a state is fully specified by the following statement: “There are
electrons with polarization direction $\mathbf{s}$ and momentum $\mathbf{p}$.” For the sake of simplicity,
we shall momentarily take only $\mathbf{p}$ as a variable. Then the state is considered to be
determined by the momentum alone. This is represented by the Dirac symbol $|\mathbf{p}\rangle$,
although actually we should write $|\text{electron}, \mathbf{s}, \mathbf{p}\rangle$. Instead of the momentum, we
could also give the position $\mathbf{r}$. The corresponding state is then characterized by $|\mathbf{r}\rangle$.
But according to the uncertainty relation we may not take $\mathbf{r}$ and $\mathbf{p}$ together, since
they cannot be determined simultaneously with such accuracy.

But we may wish to know the probability density of the state $|\mathbf{p}\rangle$ at position $\mathbf{r}$.
For the corresponding probability amplitude, we write $\langle\mathbf{r}|\mathbf{p}\rangle$ in the Dirac notation.
This complex number depends on $\mathbf{r}$ and $\mathbf{p}$.

As a further example, let us consider the double slit experiment. Here from Fig. 4.2
we can introduce four different states $|i\rangle$, $|1\rangle$, $|2\rangle$, and $|f\rangle$. Each refers to another
position. Here the states $|1\rangle$ and $|2\rangle$ are constructed from $|i\rangle$, and $|f\rangle$ from $|1\rangle$ and $|2\rangle$.


Fig. 4.2 Double slit experiment. The source generates the initial state $|i\rangle$ and the double slit selects
the states $|1\rangle$ and $|2\rangle$. The final state $|f\rangle$ is detected. The path between $|i\rangle$ and $|f\rangle$ remains unknown




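The corresponding probability amplitudes are clearly $\langle 1|i\rangle$ and $\langle 2|i\rangle$, or $\langle f|1\rangle$
and $\langle f|2\rangle$. The probability amplitude for the formation of $|f\rangle$ from $|i\rangle$ is composed
as follows:

$$\langle f|i\rangle = \langle f|1\rangle\langle 1|i\rangle + \langle f|2\rangle\langle 2|i\rangle\,,$$

and the corresponding probability density is

$$|\langle f|i\rangle|^2 = |\langle f|1\rangle\langle 1|i\rangle|^2 + |\langle f|2\rangle\langle 2|i\rangle|^2 + 2\,\mathrm{Re}\,\big(\langle f|1\rangle^*\langle 1|i\rangle^*\,\langle f|2\rangle\langle 2|i\rangle\big)\,.$$

According to p. 277, we should add the amplitudes for the different paths and then
take the square of the absolute value. Here the amplitude for each path factorizes into the
product of two amplitudes: the one to arrive at the slit from the initial state, and the one
to arrive at the detector from the slit.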

The equation before last suggests that we imagine the initial state $|i\rangle$ at the double
slit as decomposed into two states $|1\rangle$ and $|2\rangle$:

$$|i\rangle = |1\rangle\langle 1|i\rangle + |2\rangle\langle 2|i\rangle\,.$$


This superposition of states with corresponding probability amplitudes makes good
sense (superposition principle of states). Such superpositions can be understood
classically with polarized light: we can decompose into either linearly or circularly
polarized light (linearly polarized $\{|\parallel\rangle, |\perp\rangle\}$, circularly polarized $\{|+\rangle, |-\rangle\}$):


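$$|+\rangle = |\parallel\rangle\,\frac{1}{\sqrt{2}} + |\perp\rangle\,\frac{\mathrm{i}}{\sqrt{2}}\,,$$
$$|-\rangle = |\parallel\rangle\,\frac{1}{\sqrt{2}} + |\perp\rangle\,\frac{-\mathrm{i}}{\sqrt{2}}\,,$$

and conversely,

$$|\parallel\rangle = |+\rangle\,\frac{1}{\sqrt{2}} + |-\rangle\,\frac{1}{\sqrt{2}}\,,$$
$$|\perp\rangle = |+\rangle\,\frac{-\mathrm{i}}{\sqrt{2}} + |-\rangle\,\frac{\mathrm{i}}{\sqrt{2}}\,.$$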

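Instead of the four states $|+\rangle$, $|-\rangle$, $|\parallel\rangle$, and $|\perp\rangle$, we had the four unit vectors $\mathbf{e}_+$, $\mathbf{e}_-$,
$\mathbf{e}_\parallel$, and $\mathbf{e}_\perp$ on p. 219.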
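
The change between the linear and the circular polarization basis can be tried out with two-component complex vectors. The sketch below assumes the common phase convention $|\pm\rangle = (|\parallel\rangle \pm \mathrm{i}\,|\perp\rangle)/\sqrt{2}$, which may differ by phase factors from the convention on p. 219. All function names are illustrative.

```python
import math

def inner(bra, ket):
    """<bra|ket>: antilinear in the first argument, linear in the second."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

s = 1 / math.sqrt(2)
# Components are given in the linear basis {|par>, |perp>}.
par, perp = (1 + 0j, 0j), (0j, 1 + 0j)
plus, minus = (s + 0j, 1j * s), (s + 0j, -1j * s)  # assumed phase convention

def expand(state, basis):
    """Expansion coefficients <b|state> for each basis vector b."""
    return [inner(b, state) for b in basis]

def recombine(coeffs, basis):
    """Build sum_b |b> c_b from coefficients and basis vectors."""
    return tuple(sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(2))

circular = [plus, minus]
coeffs = expand(perp, circular)
print("coefficients of |perp> in the circular basis:", coeffs)
print("recombined:", recombine(coeffs, circular))
```

The squared moduli of the coefficients add up to 1, and recombining them returns the original state, as the superposition principle requires.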

We may take the states $|\ldots\rangle$ as vectors and, according to the superposition principle,
combine them linearly with complex coefficients. As already stressed in the
discussion of vector algebra (see p. 3), it is an advantage to state the expansion basis
first and then the coefficients. However, this is not true for the dual bra vectors in the
next section. We would now like to set up the rules that will be applicable here.




4.1.5 Hilbert Space (Four Axioms) 


So far we have denoted states using the Dirac symbol $|\ldots\rangle$ and have written the
attribute of the state, e.g., $\mathbf{p}$, between these “brackets” as $|\mathbf{p}\rangle$. If the attributes are
countable, then these symbols can be assigned to vectors in a Hilbert space. Here, we
list the rules for these (proper) Hilbert vectors. However, for continuous attributes like
$\mathbf{p}$, improper Hilbert vectors are necessary, and we shall discuss these in Sect. 4.1.7.

The algebraic and topological structure of the Hilbert space are determined by 
four axioms. However, in contrast to the usual vector algebra, we shall not restrict 
ourselves to three dimensions (or even finite dimensions, as is clearly expressed by 
the fourth axiom): 


1. The Hilbert space is a vector space over the field of complex numbers, i.e., its
elements $|\psi\rangle$, $|\varphi\rangle$, ... can be added and multiplied by complex numbers $a, b, \ldots$,
where the usual rules of vector algebra apply:

$$|\psi\rangle + |\varphi\rangle = |\varphi\rangle + |\psi\rangle\,,$$
$$(|\psi\rangle + |\varphi\rangle) + |\chi\rangle = |\psi\rangle + (|\varphi\rangle + |\chi\rangle)\,,$$
$$|\psi\rangle + |o\rangle = |\psi\rangle\,,$$
$$|\psi\rangle\,(a + b) = |\psi\rangle\,a + |\psi\rangle\,b\,,$$
$$(|\psi\rangle + |\varphi\rangle)\,a = |\psi\rangle\,a + |\varphi\rangle\,a\,.$$

Occasionally, we will also write $|\psi a + \varphi b\rangle$ for $|\psi\rangle\,a + |\varphi\rangle\,b$. In particular, for each
$|\psi\rangle$, there is a vector $|{-\psi}\rangle$ such that $|\psi\rangle + |{-\psi}\rangle = |o\rangle$ is a “zero vector”.

A finite set of vectors $|\psi^1\rangle, \ldots, |\psi^N\rangle$ are said to be linearly independent if
none of them can be expressed in terms of the remaining ones, so for example,
$|\psi^N\rangle \ne \sum_{n=1}^{N-1} |\psi^n\rangle\,c_n$. An infinite set of vectors are said to be linearly independent
of each other if this is true for all finite subsets. $N$ linearly independent vectors span
a vector space of dimension $N$. The set of vectors in a one-dimensional Hilbert space
forms a ray: they differ from each other only by a (complex) number.

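2. The Hilbert space has a Hermitian metric.¹ This means that a complex number
$\langle\psi|\varphi\rangle \equiv \langle\psi|\,|\varphi\rangle$ is assigned to each pair of vectors $|\psi\rangle$, $|\varphi\rangle$. This is called the scalar
product, the inner product, or the Dirac bracket. It has the following properties:

$$\langle\psi|\varphi + \chi\rangle = \langle\psi|\varphi\rangle + \langle\psi|\chi\rangle\,, \qquad \langle\psi|\varphi a\rangle = \langle\psi|\varphi\rangle\,a\,,$$
$$\langle\psi|\varphi\rangle = \langle\varphi|\psi\rangle^*\,, \qquad \langle\psi|\psi\rangle > 0 \ \text{ if } |\psi\rangle \ne |o\rangle\,.$$

It is linear, Hermitian-symmetric and positive-definite. In addition to $\langle\psi|o\rangle = 0 =
\langle o|\psi\rangle$, we thus have

$$\langle\psi + \varphi|\chi\rangle = \langle\psi|\chi\rangle + \langle\varphi|\chi\rangle\,, \qquad \langle\psi a|\varphi\rangle = a^*\,\langle\psi|\varphi\rangle\,.$$

¹ Charles Hermite (1822–1901) was a French mathematician. Hence the “e” at the end is not pronounced.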


Therefore the scalar product $\langle\psi|\varphi\rangle$ depends linearly on the ket-vectors $|\varphi\rangle$, but antilinearly
on the bra-vectors $\langle\psi|$. If we multiply the ket-vector by a number $a$, then
likewise the inner product, but if we multiply the bra-vector by $a$, then the inner
product gets multiplied by the conjugate complex $a^*$. The two types of vectors are
in principle different. They cannot be added, because they belong to “dual” spaces,
but they can be assigned to each other. A bra-vector is known if its scalar products
with all ket-vectors are known. Note that we also distinguish between covariant and
contravariant components, and both are needed for a scalar product (see p. 33). Bra-
and ket-vectors may also be considered as row and column vectors, respectively, if
we allow complex components and do not restrict the dimension.

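The quantity $\|\psi\| \equiv \sqrt{\langle\psi|\psi\rangle} \ge 0$ is called the norm (length) of the vector $|\psi\rangle$.
It vanishes only if $|\psi\rangle$ is the zero vector. We will usually restrict ourselves to vectors
of unit norm, whence we always have $|\langle\psi|\varphi\rangle|^2 \le 1$, as we shall soon see, and that
is indispensable for the probability interpretation. For this purpose, there are two
further notions we need to consider. Two Hilbert vectors are said to be orthogonal to
each other if their scalar product vanishes, and they are said to be parallel if the two
vectors differ only by a numerical factor. From this we can obtain the components
of the vector $|\psi\rangle$ parallel and orthogonal to the vector $|\varphi\rangle$ from $|\psi\rangle = |\psi_\parallel\rangle + |\psi_\perp\rangle$,
with $|\psi_\parallel\rangle = |\varphi\rangle\,\langle\varphi|\psi\rangle/\langle\varphi|\varphi\rangle$ and $|\psi_\perp\rangle = |\psi\rangle - |\psi_\parallel\rangle$. Note that we do not assume
here that $|\varphi\rangle$ is normalized to 1, and $|\psi_\parallel\rangle$ and $|\psi_\perp\rangle$ are not generally, even if $|\psi\rangle$ is.
Clearly, $\langle\varphi|\psi_\parallel\rangle = \langle\varphi|\psi\rangle$ and hence $\langle\varphi|\psi_\perp\rangle = \langle\varphi|\psi\rangle - \langle\varphi|\psi_\parallel\rangle = 0$. It then follows
that $\|\psi\|^2 = \|\psi_\parallel\|^2 + \|\psi_\perp\|^2$ with $\|\psi_\parallel\|^2 = |\langle\varphi|\psi\rangle|^2/\|\varphi\|^2$, and because $\|\psi_\perp\| \ge 0$,
we thus have $\|\psi\|^2 \ge \|\psi_\parallel\|^2$ and

Schwarz inequality: $\|\psi\|\cdot\|\varphi\| \ge |\langle\psi|\varphi\rangle|$.

In particular, $|\langle\psi|\varphi\rangle|^2 \le 1$ for vectors normalized to 1. Equality holds only if $|\psi\rangle$ and
$|\varphi\rangle$ are parallel to each other and hence $\|\psi_\perp\|$ vanishes. With $\|\psi + \varphi\|^2 = \|\psi\|^2 +
2\,\mathrm{Re}\,\langle\psi|\varphi\rangle + \|\varphi\|^2$ and $\mathrm{Re}\,\langle\psi|\varphi\rangle \le |\langle\psi|\varphi\rangle|$, Schwarz’s inequality also delivers
$\|\psi + \varphi\|^2 \le (\|\psi\| + \|\varphi\|)^2$. This upper limit is true also for $\|\psi - \varphi\|^2 = \|(\psi - \chi) -
(\varphi - \chi)\|^2$, where $|\chi\rangle$ can be an arbitrary vector. Hence,

Triangle inequality: $\|\psi - \varphi\| \le \|\psi - \chi\| + \|\varphi - \chi\|$.

For this reason, the norm $\|\psi\|$ is also referred to as the length of the Hilbert vector $|\psi\rangle$.

Incidentally, from the Schwarz and the triangle inequalities together, it follows
that $\|\psi\|\,\|\varphi\| \ge |\langle\psi|\varphi\rangle| \ge |\mathrm{Re}\,\langle\psi|\varphi\rangle| = \frac{1}{2}\,|\langle\psi|\varphi\rangle + \langle\varphi|\psi\rangle|$.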
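
The decomposition into parallel and orthogonal components and the two inequalities can be tested on a few finite-dimensional complex vectors. This is only a sketch with arbitrarily chosen three-component vectors; the helper names `inner`, `norm`, `sub`, and `split` are illustrative.

```python
import math

def inner(bra, ket):
    """<bra|ket>: antilinear in bra, linear in ket."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))

def norm(v):
    """Norm ||v|| = sqrt(<v|v>)."""
    return math.sqrt(inner(v, v).real)

def sub(u, v):
    """Componentwise difference u - v."""
    return [a - b for a, b in zip(u, v)]

def split(psi, phi):
    """Return (psi_par, psi_perp) relative to phi; phi need not be normalized."""
    c = inner(phi, psi) / inner(phi, phi)
    par = [p * c for p in phi]
    return par, sub(psi, par)

psi = [1 + 2j, -1j, 0.5 + 0j]
phi = [2 - 1j, 1 + 1j, -3j]
chi = [0.5j, 1 + 0j, 2 - 2j]

par, perp = split(psi, phi)
print("<phi|psi_perp>  =", inner(phi, perp))
print("Schwarz:  ", norm(psi) * norm(phi), ">=", abs(inner(psi, phi)))
print("Triangle: ", norm(sub(psi, phi)), "<=", norm(sub(psi, chi)) + norm(sub(phi, chi)))
```

Replacing `phi` by a multiple of `psi` makes the orthogonal component vanish and turns the Schwarz inequality into an equality.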


We can now investigate convergence in Hilbert space. However, we have to dis- 
tinguish between two notions of convergence: 




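strong convergence $|\psi^n\rangle \to |\psi\rangle$ if $\lim_{n\to\infty} \|\psi^n - \psi\| = 0$,

weak convergence $|\psi^n\rangle \rightharpoonup |\psi\rangle$ if $\lim_{n\to\infty} \langle\psi^n|\varphi\rangle = \langle\psi|\varphi\rangle$ for all $|\varphi\rangle$.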


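With the help of the Schwarz inequality, strong convergence implies weak convergence.
For all $|\varphi\rangle$ (with $\|\varphi\| < \infty$), we find

$$|\langle\psi^n|\varphi\rangle - \langle\psi|\varphi\rangle| = |\langle\psi^n - \psi|\varphi\rangle| \le \|\psi^n - \psi\|\,\|\varphi\| \to 0\,.$$

But weak convergence does not imply strong convergence, unless we also have
$\|\psi^n\| \to \|\psi\|$:

$$\|\psi^n - \psi\|^2 = \|\psi^n\|^2 + \|\psi\|^2 - 2\,\mathrm{Re}\,\langle\psi^n|\psi\rangle \;\to\; \|\psi^n\|^2 + \|\psi\|^2 - 2\,\mathrm{Re}\,\langle\psi|\psi\rangle = \|\psi^n\|^2 - \|\psi\|^2\,.$$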

But there are sequences $\{|\psi^n\rangle\}$ which converge weakly towards the zero vector
without their norm tending towards zero, for example, if each $|\psi^n\rangle$ is normalized to
1 and orthogonal to all the others. These issues are investigated in Problem 4.9.
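
The example of an orthonormal sequence can be illustrated numerically. The sketch below truncates the infinite-dimensional space to `DIM` components, which is an assumption made only to keep the vectors finite, and uses the square-summable test vector with components $1/k$. The names are illustrative.

```python
import math

DIM = 2000  # truncation of the infinite-dimensional space (assumption of this sketch)

def unit_vector(n):
    """n-th orthonormal basis vector (n = 1, 2, ...), truncated to DIM components."""
    return [1.0 if k == n else 0.0 for k in range(1, DIM + 1)]

def inner(u, v):
    """Scalar product; real components suffice for this example."""
    return sum(a * b for a, b in zip(u, v))

phi = [1.0 / k for k in range(1, DIM + 1)]  # square-summable test vector

indices = (1, 10, 100, 1000)
overlaps = [inner(unit_vector(n), phi) for n in indices]
norms = [math.sqrt(inner(unit_vector(n), unit_vector(n))) for n in indices]
print("overlaps with phi:", overlaps)
print("norms:            ", norms)
```

The overlaps with the fixed vector decrease towards zero, as weak convergence to the zero vector requires, while the norms stay at 1, so there is no strong convergence.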

A Cauchy sequence of vectors $|\psi^n\rangle$ is understood as a sequence for which
$\|\psi^n - \psi^m\|$ becomes smaller than any $\varepsilon > 0$ with increasing $n$ and $m$. Every strongly
convergent sequence is a Cauchy sequence. Conversely, each Cauchy sequence converges
strongly if the limit vector also belongs to the Hilbert space. This is taken care
of by the third axiom.


3. The Hilbert space is complete, in the sense that it contains all its accumulation 
points. For a finite-dimensional space, this is not in fact an additional requirement. 
The fourth axiom is then obsolete. 


4. The Hilbert space is of countably infinite dimension (separable), meaning that it
contains only a countable infinity of mutually orthogonal unit vectors, $\{|e_n\rangle\}$ with
$\langle e_n|e_{n'}\rangle = \delta_{nn'}$ for all natural numbers $n$ and $n'$. A system of such vectors is referred
to as an orthonormal system. It consists of vectors which are orthogonal to each other
and normalized to 1. We will write $|n\rangle$ for short, instead of $|e_n\rangle$.

The Hilbert space vectors are thus abbreviations for states and the scalar products
(of vectors normalized to 1) for probability amplitudes. Then, e.g., $\langle\psi|\varphi\rangle$ is the
probability amplitude for finding $|\psi\rangle$ if the system is in the state $|\varphi\rangle$. We shall
determine such probability amplitudes later, e.g., $\langle\mathbf{r}|\mathbf{p}\rangle = h^{-3/2} \exp(\mathrm{i}\,\mathbf{r}\cdot\mathbf{p}/\hbar)$.


4.1.6 Representation of Hilbert Space Vectors 


Every arbitrary (normalizable) Hilbert vector $|\psi\rangle$ can be expanded in terms of a
complete basis $\{|n\rangle\}$, where we assume an orthonormal system so that $\langle n|n'\rangle = \delta_{nn'}$
should hold:




$|\psi\rangle = \sum_n |n\rangle\langle n|\psi\rangle$, $\quad \langle\psi| = \sum_n \langle\psi|n\rangle\langle n|$.


The order of the factors, that is, expansion coefficients after ket-vectors and before
bra-vectors, is actually arbitrary, but will turn out later to be particularly practical.
For example, we shall treat $\sum_n |n\rangle\langle n|$ as a unit operator and write equations like
$|\psi\rangle = 1\,|\psi\rangle$ and $\langle\psi| = \langle\psi|\,1$. Then, according to p. 282, $\langle\psi|n\rangle = \langle n|\psi\rangle^*$ holds. The
expansion coefficients


$\psi_n = \langle n|\psi\rangle$


are the (complex) vector components of $|\psi\rangle$ in this basis. The sequence $\{\psi_n\}$ gives
the representation of the vector $|\psi\rangle$ in the basis $\{|n\rangle\}$. For the scalar product of two
vectors, it then follows that


$\langle\psi|\varphi\rangle = \sum_n \langle\psi|n\rangle\langle n|\varphi\rangle = \sum_n \psi_n^*\,\varphi_n$,


described as insertion of intermediate states or insertion of unity (in Fig. 4.2, we used
only two states $|1\rangle$ and $|2\rangle$). The special case $|\varphi\rangle = |\psi\rangle$, viz.,


$\|\psi\|^2 = \sum_{n=1}^{\infty} |\psi_n|^2$,


is called the completeness relation. It holds only if no basis vector is missing. Finally,


$\|\psi\|^2 \ge \sum_{n=1}^{N} |\psi_n|^2$


is Bessel's inequality.

The Hilbert vectors which were initially introduced only formally thus become 
rather simple constructs as soon as a discrete basis is introduced in Hilbert space. 
Then each vector is given by its (possibly infinitely many) complex components 
with respect to this basis, i.e., by a sequence of complex numbers. We then speak of 
vectors in sequence space. 

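If we take the sequence $\{\langle n|\psi\rangle = \psi_n\}$ as a column vector and $\{\langle\psi|n\rangle = \psi_n^*\}$ as
a row vector, i.e.,


$|\psi\rangle = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \end{pmatrix}$ and $\langle\psi| = (\psi_1^*, \psi_2^*, \ldots)$,


then the scalar products $\langle\psi|\varphi\rangle$ obey matrix multiplication rules.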
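As a small illustration of the component representation, the sketch below evaluates a scalar product as "row vector of complex conjugates times column vector" and compares the full and truncated sums of squared moduli (completeness relation versus Bessel's inequality). The three-component vectors are assumed example data, not taken from the text.

```python
# Hypothetical components psi_n = <n|psi> and phi_n = <n|phi> in some orthonormal basis.
psi = [1 + 1j, 0.5j, -2.0]
phi = [2.0, 1 - 1j, 0.5]

def braket(a, b):
    """<a|b> = sum_n a_n^* b_n (row of conjugates times column)."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

norm_sq = braket(psi, psi).real               # sum over all components
partial = sum(abs(c) ** 2 for c in psi[:2])   # truncated sum over N = 2 components

print(braket(psi, phi))               # <psi|phi>
print(braket(phi, psi).conjugate())   # <phi|psi>^*, the same number
print(norm_sq, partial)               # the truncated sum cannot exceed the full one
```

With all basis vectors included the sum reproduces the squared norm; leaving one out gives a smaller value, as Bessel's inequality requires.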




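Of course, we may introduce a new basis, e.g., $\{|m\rangle\}$ with $\langle m|m'\rangle = \delta_{mm'}$. For the
change of representation $\{|n\rangle\} \to \{|m\rangle\}$, we clearly have


$\psi_m = \langle m|\psi\rangle = \sum_n \langle m|n\rangle\langle n|\psi\rangle = \sum_n \langle m|n\rangle\,\psi_n$,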


i.e., the components in one basis follow from the components in the other basis, 
where the components of the basis vectors occur as expansion coefficients. This is 
similar to our procedure for the orthogonal transformation on p. 29, and will be 
important in Sect. 4.2 as a “unitary transformation”. 

We now take an obvious step which leads to a new representation of the Hilbert
space. We plot the complex numbers $\psi_n = \langle n|\psi\rangle$ versus the natural numbers $\{n\}$ on
the number axis, and then, not only to the natural numbers, but to all real numbers
$x$, assign values $\psi(x) = \langle x|\psi\rangle$. This delivers a different representation of the Hilbert
space, namely the Hilbert function space $\{\psi(x)\}$. It combines all complex functions
defined on the real axis for which the square of their absolute value can be integrated in
the Lebesgue sense, i.e., they are integrable almost everywhere, in the sense that only
a set of arguments of measure zero need be excluded: $\langle\psi|\psi\rangle = \int |\psi(x)|^2\,\mathrm{d}x < \infty$.
Such functions are said to be normalizable. With a finite numerical factor, they can
be normalized to 1. The range of integration corresponds to the domain of definition,
which can be infinite on both sides. This function space is a linear space. It is complete
and has a countable infinity of dimensions. The inner product is now given by


$\langle\psi|\varphi\rangle = \int \psi^*(x)\,\varphi(x)\,\mathrm{d}x$,


i.e., the sum in sequence space becomes an integral in the function space. 
With this we can then express a complete orthonormal system of functions $\{g_n(x)\}$
(see p. 21) with $\int g_n^*(x)\,g_{n'}(x)\,\mathrm{d}x = \delta_{nn'}$ in the useful form


$g_n(x) = \langle x|n\rangle$.


An arbitrary (normalizable) function can be expanded in terms of this orthonormal
system (represented in this basis):


$\psi(x) = \langle x|\psi\rangle = \sum_n \langle x|n\rangle\langle n|\psi\rangle = \sum_n g_n(x)\,\psi_n$,


with the expansion coefficients (“Fourier coefficients”)


$\psi_n = \int \langle n|x\rangle\langle x|\psi\rangle\,\mathrm{d}x = \int g_n^*(x)\,\psi(x)\,\mathrm{d}x$.


The best-known example is the Fourier expansion; another is the expansion in
terms of Legendre polynomials, with domain of definition $-1 \le x \le 1$.
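The expansion in Legendre polynomials can be tried out numerically. The sketch below is an illustration under stated assumptions: it uses the normalized functions sqrt((2n+1)/2) P_n(x) as the orthonormal system on the interval from -1 to 1, the example function x^3, and a simple composite Simpson rule; the function names are hypothetical.

```python
import math

def legendre(n, x):
    """Legendre polynomial P_n(x) via the recursion (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def g(n, x):
    """Orthonormal basis function g_n(x) = <x|n> on -1 <= x <= 1 (real valued)."""
    return math.sqrt((2 * n + 1) / 2) * legendre(n, x)

def integrate(f, a=-1.0, b=1.0, steps=2000):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def psi(x):
    return x ** 3   # example function, chosen only for illustration

# Expansion coefficients psi_n = integral of g_n(x) psi(x) dx for n = 0..3
coeffs = [integrate(lambda x, n=n: g(n, x) * psi(x)) for n in range(4)]

def psi_reconstructed(x):
    """Sum over n of g_n(x) psi_n."""
    return sum(c * g(n, x) for n, c in enumerate(coeffs))

print(coeffs)
print(psi(0.5), psi_reconstructed(0.5))
```

For this cubic only the coefficients with n = 1 and n = 3 are non-zero, and four basis functions already reproduce the function up to the integration error.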




The function $\psi(x)$ in Hilbert function space is then represented as a vector $\{\psi_n\}$
in Hilbert sequence space, or the vector as a function. Depending on the basis,
the same Hilbert vector $|\psi\rangle$ appears in different forms: in the sequence space, we
obtain Heisenberg's matrix mechanics, and in function space, Schrödinger's wave
mechanics.


4.1.7 Improper Hilbert Vectors 


However, $|x\rangle$ and $\langle x|$ are not genuine (proper) Hilbert vectors. If we compare
the scalar product $\langle\psi|\varphi\rangle = \sum_{nn'} \langle\psi|n\rangle\langle n|n'\rangle\langle n'|\varphi\rangle$ with the expected expression
$\iint \langle\psi|x\rangle\langle x|x'\rangle\langle x'|\varphi\rangle\,\mathrm{d}x\,\mathrm{d}x'$, then $\int \langle x|x'\rangle\,\varphi(x')\,\mathrm{d}x'$ must be equal to $\varphi(x)$. The scalar
product $\langle x|x'\rangle$ is clearly equal to the Dirac delta function (see Sect. 1.1.10), i.e.,


$\langle x|x'\rangle = \delta(x - x')$,


and hence is no longer a typical number. Dirac symbols with continuous variables
are not proper Hilbert space vectors, but improper Hilbert vectors. The normalization
to the delta function is called normalization in the continuum, and often also
delta-normalization.

Since $x$ is a continuous variable, $|\langle x|\psi\rangle|^2$ should not be called a probability: it is
a probability density. Only $|\langle x|\psi\rangle|^2\,\mathrm{d}x$ is a probability, in particular, for the interval
$\mathrm{d}x$, and only probabilities can be compared with observed values. For example, there
is no particle at the position $\mathbf{r}$, but only in a region $\mathrm{d}^3r$ around $\mathbf{r}$. The more certain its
position, the more uncertain its momentum! While we may often speak of a particle
with the momentum $\mathbf{p}$, e.g., in Sect. 4.1.4, our main interest is not the “small error
interval” $\Delta\mathbf{p}$, otherwise its position would have to be totally uncertain.

Continuous variables are often convenient for calculation, and we shall use them
repeatedly, even if they are only ever observed in a certain interval. For the same
reason we will not be disturbed by the fact that $\langle x|x'\rangle = \delta(x - x')$ is not a standard
number (function of $x$ and $x'$). It is quite sufficient that the delta function has a definite
meaning in an integral.




4.1.8 Summary: Wave-Particle Dualism


Quantum mechanics is more general than its classical limiting case, since quantum 
theory includes the fact that canonically conjugate quantities cannot simultaneously 
be sharp (or certain)—they are complementary, in the sense that the more precise one 
quantity is, the less precise the other will be, a fact overlooked in classical mechanics. 

We take Heisenberg's uncertainty relation $\Delta x^k \cdot \Delta p_{k'} \ge \tfrac{1}{2}\hbar\,\delta^k_{k'}$ as the basic
experimental fact. The consequences are far-reaching. In particular, the particle and wave
pictures in quantum theory are no longer in contradiction, because all measurable
quantities then have “uncertain” values in precisely such a way that the two pictures
remain compatible with each other: neither the particle number nor the phase in the
statistical ensemble has to have a sharp value.

Uncertainty is a statistical notion and quantum theory a statistical theory for the 
determination of probabilities and average values. Since interference occurs for these 
probabilities, we work with probability amplitudes and take these as scalar products 
of Hilbert vectors. The ket-vector $|\ldots\rangle$ specifies the attributes of the considered
ensemble and the bra-vector $\langle\ldots|$ the attributes for the probability. Then the square
of the absolute value of the scalar product $\langle\psi|\varphi\rangle$ gives the probability for the attribute
$\psi$ in the ensemble $\varphi$. The rules for these state vectors have been presented here. We
assign proper or improper Hilbert space vectors to them, depending on whether they
are valid for countable or continuous variables, respectively.

Concerning the scalar product $\langle\psi|\varphi\rangle$, initially only the square of the absolute value
can be measured, i.e., as the associated probability. Only if two amplitudes interfere 
with each other can the relative phase be determined, and even then, a global phase 
factor remains free. 

Incidentally, as early as 1781, I. Kant wrote in his Kritik der reinen Vernunft: 
“[...] consequently we cannot have knowledge of a matter as a thing as such, but 
only as much as it is an object of the sensuous perception”, something Heisenberg 
also stressed. Only then can such knowledge be proven as a law, if the experiment 
is repeated. This leads to statistics. Then the uncertainty relations are valid from the 
moment the statistical ensemble has been produced, not at the time of the individual 
measurements. Anyone who does not take this fact into account will very likely find 
quantum theory incomplete. 


4.2 Operators and Observables 


4.2.1 Linear and Anti-linear Operators 


The state vectors $|\ldots\rangle$ and $\langle\ldots|$ are mathematical tools to describe pure states in
quantum theory. In addition, we need quantities which act on these state vectors,
which we call operators. We always write them with upper-case letters:


$|\psi'\rangle = A\,|\psi\rangle$.


Operators assign an image vector $|\psi'\rangle$ to each object vector $|\psi\rangle$. (We can also
consider operators which are only defined on a part of space, but we do not wish to
deal with those here.) If we know the image vector $|\psi'\rangle$ for each vector $|\psi\rangle$, then
we also know the operator $A$, just as a bra-vector is determined if its scalar products
with all ket-vectors are known. If $A\,|\psi\rangle = A'\,|\psi\rangle$ for all $|\psi\rangle$, then the two operators
are equal, i.e., $A = A'$. The zero operator assigns the zero vector to all vectors, i.e.,
$0\,|\psi\rangle = |o\rangle$, while the unit operator maps each vector onto itself, i.e.,
$1\,|\psi\rangle = |\psi\rangle$.

In quantum mechanics, only linear and anti-linear operators occur. They are linear
if


$A\,|\psi + \varphi\rangle = A\,|\psi\rangle + A\,|\varphi\rangle$ and $A\,|\psi a\rangle = A\,|\psi\rangle\,a$,


while for an anti-linear operator, $A\,|\psi a\rangle = (A\,|\psi\rangle)\,a^*$. In quantum theory, there is
only one important anti-linear operator, namely the time reversal operator $\mathcal{T}$, which
we shall discuss in Sect. 4.2.12 (and also the charge-inversion operator $\mathcal{C}$ for the
Dirac equation). Until then, we shall deal only with linear operators. They can be
added and multiplied by complex numbers:


$(aA + bB)\,|\psi\rangle = A\,|\psi\rangle\,a + B\,|\psi\rangle\,b$.


The product of two operators depends on the order of the factors: $AB$ may differ from
$BA$. We define the commutator and anti-commutator of two operators $A$ and $B$ by


$[A, B] \equiv AB - BA$   commutator of $A$ and $B$,

$[A, B]_+ \equiv AB + BA = \{A, B\}$   anti-commutator of $A$ and $B$.


If $AB = BA$ holds, the two operators commute with each other. Then also $aA$ and $bB$
commute with each other. The unit operator and the zero operator commute with all
operators. In quantum theory, it is important to know whether or not two operators
commute with each other, so here are several properties of commutators:


$[A, B] = -[B, A]$,

$[A, B + C] = [A, B] + [A, C]$,

$[A, BC] = [A, B]\,C + B\,[A, C]$.


Hence, with $[A, B^n] = [A, B]\,B^{n-1} + B\,[A, B^{n-1}]$ for $n \in \{1, 2, \ldots\}$, it follows that


$[A, B^n] = \sum_{k=0}^{n-1} B^k\,[A, B]\,B^{n-k-1}$.


In particular, for $[[A, B], B] = 0$, we have $[A, B^n] = n\,[A, B]\,B^{n-1}$. The last expression
is sometimes written as $[A, B]\,\mathrm{d}B^n/\mathrm{d}B$, because we can also differentiate with
respect to operators, as we shall see on p. 316. In addition, we find Jacobi's identity


$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0$,




as for the vector product (see p. 4) and the Poisson brackets (p. 124). Note that,
together with skew-symmetry $[A, B] = -[B, A]$ and the bilinearity property


$[aA + bB, C] = a\,[A, C] + b\,[B, C]$

in the abstract vector space of the quantities $A, B, C, \ldots$, the Jacobi identity shows
that the commutator makes the set of operators into a Lie algebra.
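The commutator rules can be checked with small matrices. The following sketch is only a numerical spot check, not a proof: it uses three arbitrary integer 2 x 2 matrices (assumed example data) and helper functions named here for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return sub(matmul(a, b), matmul(b, a))

def power(a, n):
    """a^n for n >= 0, with a^0 the unit matrix."""
    size = len(a)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, a)
    return result

# Arbitrary integer test matrices; integers keep all comparisons exact.
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 2]]
C = [[2, 0], [1, -1]]
ZERO = [[0, 0], [0, 0]]

# Product rule [A, BC] = [A, B] C + B [A, C]
product_rule_ok = comm(A, matmul(B, C)) == add(matmul(comm(A, B), C), matmul(B, comm(A, C)))

# [A, B^n] = sum_{k=0}^{n-1} B^k [A, B] B^{n-k-1}, here for n = 4
n = 4
rhs = ZERO
for k in range(n):
    rhs = add(rhs, matmul(matmul(power(B, k), comm(A, B)), power(B, n - k - 1)))
power_rule_ok = comm(A, power(B, n)) == rhs

# Jacobi identity
jacobi = add(add(comm(A, comm(B, C)), comm(B, comm(C, A))), comm(C, comm(A, B)))
jacobi_ok = jacobi == ZERO

print(product_rule_ok, power_rule_ok, jacobi_ok)
```

Since the identities hold for all operators, any other choice of test matrices should give the same three confirmations.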
There are functions of operators, e.g., polynomials in $A$ like


$f(A) = c_0\,1 + c_1 A + c_2 A^2 + \cdots + c_n A^n$,

and the “exponential function” $\exp A = \mathrm{e}^A = \sum_{n=0}^{\infty} A^n/n!$. Incidentally,


$\mathrm{e}^{A}\,B\,\mathrm{e}^{-A} = B + \frac{[A, B]}{1!} + \frac{[A, [A, B]]}{2!} + \frac{[A, [A, [A, B]]]}{3!} + \cdots$


is called the Hausdorff series. This equation can be proven by considering the function
$f(t) = \mathrm{e}^{At}\,B\,\mathrm{e}^{-At}$, for which $\dot f = [A, f]$, $\ddot f = [A, \dot f]$, and similar. With $f(0) = B$, its
Taylor series about $t = 0$ for $t = 1$ delivers the Hausdorff series. From $[A, 1] = 0$, it
follows in particular that $\mathrm{e}^{A}\,\mathrm{e}^{-A} = 1$, which can be generalized:


$[A, [A, B]] = 0 = [B, [A, B]] \implies \mathrm{e}^{A}\,\mathrm{e}^{B} = \mathrm{e}^{\frac{1}{2}[A, B]}\,\mathrm{e}^{A+B}$.


In particular, if we set $f(t) = \mathrm{e}^{At}\,\mathrm{e}^{Bt}$, then $\dot f = Af + fB = (A + \mathrm{e}^{At} B\,\mathrm{e}^{-At})\,f$, and
with $[A, [A, B]] = 0$, according to Hausdorff, $\dot f = (A + B + [A, B]\,t)\,f$. With $f(0) = 1$
this implies $f(t) = \exp\{(A + B)\,t + \tfrac{1}{2}[A, B]\,t^2\}$. The claim follows for $t = 1$, since
$\exp(\tfrac{1}{2}[A, B])$ may be factored out, because $[A, B]$ is assumed to commute with $A$
and $B$.
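Both the Hausdorff series and the product formula for exponentials can be verified exactly in a simple case. The sketch below assumes, purely for illustration, two strictly upper triangular 3 x 3 matrices whose commutator commutes with both of them; for such nilpotent matrices the exponential series terminates, so rational arithmetic gives exact results.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def exp_nilpotent(a):
    """exp(a) for a strictly upper triangular n x n matrix: a^n = 0, so the series is finite."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for k in range(1, n):
        term = scale(Fraction(1, k), matmul(term, a))   # term = a^k / k!
        result = add(result, term)
    return result

# Assumed example: A and B with [A, [A, B]] = 0 = [B, [A, B]]
A = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
C = comm(A, B)

# Hausdorff series: e^A B e^{-A} = B + [A, B], since all higher commutators vanish here
hausdorff_ok = matmul(matmul(exp_nilpotent(A), B), exp_nilpotent(scale(-1, A))) == add(B, C)

# e^A e^B = e^{[A, B]/2} e^{A+B}
lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))
rhs = matmul(exp_nilpotent(scale(Fraction(1, 2), C)), exp_nilpotent(add(A, B)))
product_ok = lhs == rhs

print(hausdorff_ok, product_ok)
```

The helper `exp_nilpotent` is not a general matrix exponential; it relies on the assumed nilpotency of its argument.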

Numerical factors multiplying Hilbert vectors can be considered as very simple 
operators. They are multiples of the unit operator and hence commute with every 
linear operator. 


4.2.2 Matrix Elements and Representation of Linear 
Operators 


So far the operators have been acting only on ket-vectors $|\ldots\rangle$. We consider now the
scalar product of an arbitrary bra-vector $\langle\psi|$ with the ket-vector $A\,|\varphi\rangle$, where $A$ is a
linear operator. Each scalar product depends linearly on its ket-vector, but now the
ket-vector $A\,|\varphi\rangle$ also depends linearly on $|\varphi\rangle$. Thus the scalar product depends linearly
on $|\varphi\rangle$. Consequently, a bra-vector can be constructed from the other quantities $\langle\psi|$
and $A$. Hence, for linear operators, we have




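$\langle\psi|\,(A\,|\varphi\rangle) = (\langle\psi|\,A)\,|\varphi\rangle = \langle\psi|\,A\,|\varphi\rangle$.


This complex number is called the matrix element of the operator $A$ between the
states $\langle\psi|$ and $|\varphi\rangle$.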

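In order to understand the connection with matrices, we take any discrete complete
orthonormal system $\{|n\rangle\}$ and consider $A\,|\psi\rangle = \sum_{n'} A\,|n'\rangle\langle n'|\psi\rangle$ and
$A\,|n'\rangle = \sum_n |n\rangle\langle n|\,A\,|n'\rangle$. If we compare this with the original expression, we have


$A = \sum_{nn'} |n\rangle\langle n|\,A\,|n'\rangle\langle n'|$.


The complex numbers $\langle n|\,A\,|n'\rangle$ form the matrix of the operator $A$ in the $n$ representation
(possibly with infinitely many matrix elements):


$\begin{pmatrix} \langle n_1|\,A\,|n_1\rangle & \langle n_1|\,A\,|n_2\rangle & \cdots \\ \langle n_2|\,A\,|n_1\rangle & \langle n_2|\,A\,|n_2\rangle & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$.


If the matrix is known, then so is the operator “in the $n$ representation”. Its “diagonal
elements” are $\langle n|\,A\,|n\rangle$ and its “off-diagonal elements” $\langle n|\,A\,|n'\rangle$ (with $n \ne n'$).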

In the Dirac notation, $\langle\ldots|\ldots\rangle$ is thus a number and $|\ldots\rangle\langle\ldots|$ an operator. A
particularly important example is the unit operator


$1 = \sum_n |n\rangle\langle n|$, with $\langle n|\,1\,|n'\rangle = \langle n|n'\rangle = \delta_{nn'}$.


In this sense, the above-mentioned representations $|\psi\rangle = \sum_n |n\rangle\langle n|\psi\rangle$ for states
and $A = \sum_{nn'} |n\rangle\langle n|\,A\,|n'\rangle\langle n'|$ for operators are to be interpreted as $|\psi\rangle = 1\,|\psi\rangle$ and
$A = 1A1$. This also shows why the notation $\langle n|\,A\,|n'\rangle$ is preferred over the abbreviation
$A_{nn'}$, even though the symbol takes up more room.

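We can take the objects $|n\rangle\langle n|$ as operators projecting onto the states $|n\rangle$. In
particular, if $|\psi\rangle$ is normalized to 1, the projection operator onto the state $|\psi\rangle$ is


$P_\psi = |\psi\rangle\langle\psi|$, with $\langle n|\,P_\psi\,|n'\rangle = \langle n|\psi\rangle\langle\psi|n'\rangle = \psi_n\,\psi_{n'}^*$,

because this operator projects an arbitrary vector $|\varphi\rangle$ onto the vector $|\psi\rangle$:


$P_\psi\,|\varphi\rangle = |\psi\rangle\langle\psi|\varphi\rangle$.


For $\|\psi\| = 1$, we always have $P_\psi^2 = P_\psi$, even though $P_\psi \ne 1$ (and $\ne 0$) holds:
projection operators are idempotent.
For the operator product $AB$, we have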




$\langle n|\,AB\,|n'\rangle = \sum_{n''} \langle n|\,A\,|n''\rangle\langle n''|\,B\,|n'\rangle$,


and hence the usual law of matrix multiplication. 
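The matrix picture of a projection operator can be made concrete with two components. This is a minimal sketch with assumed example vectors; it builds the matrix elements of the projector from the components of a normalized vector and checks idempotency and the projection property.

```python
def braket(a, b):
    """<a|b> = sum_n a_n^* b_n."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def outer(a, b):
    """Matrix elements <n| (|a><b|) |n'> = a_n b_{n'}^*."""
    return [[x * y.conjugate() for y in b] for x in a]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def apply(m, v):
    """Matrix times column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

psi = [0.6, 0.8j]          # normalized to 1: 0.36 + 0.64 = 1 (assumed example)
phi = [1.0, 1.0 + 2.0j]    # arbitrary vector (assumed example)

P = outer(psi, psi)        # projection operator onto psi
P2 = matmul(P, P)
idempotent = all(abs(P2[i][j] - P[i][j]) < 1e-12 for i in range(2) for j in range(2))

projected = apply(P, phi)                        # P|phi>
expected = [c * braket(psi, phi) for c in psi]   # |psi><psi|phi>

print(idempotent)
print(projected, expected)
```

The product `P2` is formed with the ordinary matrix multiplication rule, which is all that the operator product amounts to in a given representation.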


4.2.3 Associated Operators 


If the product of two operators is equal to the unit operator, as exemplified by $\mathrm{e}^{A}$ and
$\mathrm{e}^{-A}$ above, each operator is said to be the inverse of the other:


$A\,A^{-1} = 1 = A^{-1}A$.


But note that not every operator has an inverse. For singular operators, there is a
$|\psi\rangle \ne |o\rangle$ with $A\,|\psi\rangle = |o\rangle$, and from $|o\rangle$ there is no operator leading back to $|\psi\rangle$.
For operator products, we have


$(AB)^{-1} = B^{-1}A^{-1}$,


because their product with $AB$ gives 1 in both cases.
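What the operator $A$ produces for ket-vectors, its (Hermitian) adjoint operator $A^\dagger$
does for bra-vectors:


$A\,|\psi\rangle = |\psi'\rangle \iff \langle\psi|\,A^\dagger = \langle\psi'|$.

Hence we always have $\langle\psi|\,A^\dagger\,|\varphi\rangle = \langle\psi'|\varphi\rangle = \langle\varphi|\psi'\rangle^* = \langle\varphi|\,A\,|\psi\rangle^*$, together with

$A^{\dagger\dagger} = A$ and $(AB)^\dagger = B^\dagger A^\dagger$,


since $\langle\psi|\,(AB)^\dagger\,|\varphi\rangle = \langle\varphi|\,AB\,|\psi\rangle^* = \sum_n \langle\varphi|\,A\,|n\rangle^*\,\langle n|\,B\,|\psi\rangle^* = \langle\psi|\,B^\dagger A^\dagger\,|\varphi\rangle$ for all $\langle\psi|$
and $|\varphi\rangle$. For real matrices, the adjoint is the same as the transpose (reflected in the
main diagonal). Incidentally, $(A^\dagger)^{-1} = (A^{-1})^\dagger$ holds, since $A^\dagger (A^{-1})^\dagger = (A^{-1}A)^\dagger = 1^\dagger = 1$.
Instead of “Hermitian adjoint”, we usually speak of the Hermitian conjugate
and abbreviate this to h.c., e.g., $A + A^\dagger$ is the same as $A +$ h.c.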

An operator would be called self-adjoint if 


$A^\dagger = A$, i.e., $\langle\psi|\,A\,|\varphi\rangle = \langle\varphi|\,A\,|\psi\rangle^*$, for all $|\psi\rangle$ and $|\varphi\rangle$,


which is true, e.g., for any projection operator. Such operators are also said to be
Hermitian, even though the domains of definition of $A$ and $A^\dagger$ for a Hermitian operator
do not have to coincide. Hence, all self-adjoint operators are Hermitian, but not all
Hermitian operators are self-adjoint. Note also that Hermitian operators always have
real diagonal elements. If all the matrix elements are real, as for the tensor of inertia,
then we speak of a symmetric matrix rather than a Hermitian matrix. The product
of two Hermitian operators is only Hermitian if they commute with each other. But
$\{A, B\}$ and $[A, B]/\mathrm{i}$ are Hermitian, so we can use


$AB = \frac{AB + BA}{2} + \mathrm{i}\,\frac{AB - BA}{2\mathrm{i}} = \frac{\{A, B\}}{2} + \mathrm{i}\,\frac{[A, B]}{2\mathrm{i}}$


if $AB$ is not Hermitian.
An operator is said to be unitary, if 


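$U^\dagger = U^{-1} \iff U\,U^\dagger = 1 = U^\dagger U$,


whence $\sum_n \langle n'|\,U\,|n\rangle\,\langle n''|\,U\,|n\rangle^* = \langle n'|n''\rangle = \sum_n \langle n|\,U\,|n'\rangle^*\,\langle n|\,U\,|n''\rangle$. If the matrix
is unitary and real, like the rotation matrix on p. 29, then it is also said to be orthogonal.
Note that any unitary $2 \times 2$ matrix can be obtained from 3 real parameters $\alpha$, $\beta$, $\gamma$,
in particular from


$U = \begin{pmatrix} \cos\alpha\,\exp(\mathrm{i}\beta) & \sin\alpha\,\exp(-\mathrm{i}\gamma) \\ -\sin\alpha\,\exp(\mathrm{i}\gamma) & \cos\alpha\,\exp(-\mathrm{i}\beta) \end{pmatrix}$,


if we disregard a common phase factor, hence a fourth parameter. The inverse of a
$2 \times 2$ matrix is given on p. 71.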
Unitary operators $U$ can be derived from self-adjoint operators $A$ via


$U = \exp(\mathrm{i}A)$,


since, according to the last section, for $A = A^\dagger$, we find

$U\,U^\dagger = \exp(\mathrm{i}A - \mathrm{i}A)\,\exp(\tfrac{1}{2}[\mathrm{i}A, -\mathrm{i}A]) = 1$.


For infinitesimal transformations (with $A = A^\dagger \ll 1$), the approximation
$\exp(\pm\mathrm{i}A) \approx 1 \pm \mathrm{i}A$ is often used. A different relation between a Hermitian and
a unitary operator is produced by $U = (1 - \mathrm{i}A)(1 + \mathrm{i}A)^{-1}$. For the proof, we use the
fact that the factors commute.
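The proof indicated for $U = (1 - \mathrm{i}A)(1 + \mathrm{i}A)^{-1}$ can be written out in a few lines. The following is a sketch under the assumptions that $A = A^\dagger$ and that $1 \pm \mathrm{i}A$ are invertible; it uses only $(AB)^\dagger = B^\dagger A^\dagger$, $(A^\dagger)^{-1} = (A^{-1})^\dagger$, and the fact that $1 + \mathrm{i}A$, $1 - \mathrm{i}A$, and their inverses all commute with each other.

```latex
\begin{align*}
U^\dagger &= \bigl[(1 + \mathrm{i}A)^{-1}\bigr]^\dagger (1 - \mathrm{i}A)^\dagger
           = (1 - \mathrm{i}A)^{-1} (1 + \mathrm{i}A), \\
U^\dagger U &= (1 - \mathrm{i}A)^{-1} (1 + \mathrm{i}A) (1 - \mathrm{i}A) (1 + \mathrm{i}A)^{-1} \\
            &= (1 - \mathrm{i}A)^{-1} (1 - \mathrm{i}A) (1 + \mathrm{i}A) (1 + \mathrm{i}A)^{-1} = 1, \\
U U^\dagger &= (1 - \mathrm{i}A) (1 + \mathrm{i}A)^{-1} (1 - \mathrm{i}A)^{-1} (1 + \mathrm{i}A) \\
            &= (1 - \mathrm{i}A) (1 - \mathrm{i}A)^{-1} (1 + \mathrm{i}A)^{-1} (1 + \mathrm{i}A) = 1 .
\end{align*}
```

Both orders of the product give the unit operator, so $U$ is unitary and not merely isometric.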

If all vectors are subjected to a unitary transformation $U$, then their scalar products
remain the same:


$\langle\psi'|\varphi'\rangle = \langle\psi|\,U^\dagger U\,|\varphi\rangle = \langle\psi|\varphi\rangle$,


in particular all vectors keep the same norm. Unitary transformations are thus isometric.
Here, only $U^\dagger U = 1$ is necessary, but not $U\,U^\dagger = 1$. If $U$ is isometric, then $U\,U^\dagger$
is a projection operator, since $(U\,U^\dagger)^2$ is then equal to $U\,U^\dagger$. For a finite dimension,
unitarity follows from isometry.

With a unitary operator $U$, a complete orthonormal system $\{|n\rangle\}$ can be
transformed into a different basis $\{|m\rangle = U\,|n\rangle\}$. A change of representation always
corresponds to a unitary transformation. If the vectors are transformed with
$|\psi'\rangle = U\,|\psi\rangle$, then the operators are likewise transformed with


$A' = U A U^{-1} = U A U^\dagger$,


since $|\varphi\rangle = A\,|\psi\rangle$ and $U\,|\varphi\rangle = |\varphi'\rangle = A'\,|\psi'\rangle = A'U\,|\psi\rangle$ implies $UA = A'U$. Correspondingly,
the operator function $f(A)$ turns into $U f(A)\,U^{-1} = f(UAU^{-1})$, since
all $f(A)$ are to be taken as power series of $A$, and the unit operator $U^{-1}U$ may be
inserted between the individual factors.

The trace of an operator, i.e., the sum of its diagonal elements, always remains
constant under unitary transformations in finite dimensions, since then $\mathrm{tr}(AB) = \mathrm{tr}(BA)$
and hence $\mathrm{tr}(UAU^{-1}) = \mathrm{tr}A$. For infinite dimensions, this is not always true,
as we shall see for a counterexample on p. 303.

If $A$ and $B$ commute, then the operator $B$ remains the same for the transformation
$U = \exp(\mathrm{i}A)$, as follows immediately from the Hausdorff series (p. 290). Here, $U$
does not need to be unitary. This is only necessary if, for self-adjoint $A$, we also
require $A'$ to be self-adjoint, because $A' = UAU^{-1}$ implies $A'^\dagger = (U^{-1})^\dagger A^\dagger U^\dagger$, and
this is equal to $A'$ if $U^{-1} = U^\dagger$.


4.2.4 Eigenvalues and Eigenvectors 


These notions are defined as follows (see p. 87). If

$A\,|\alpha\rangle = |\alpha\rangle\,a$,


then $a$ is an eigenvalue and $|\alpha\rangle$ ($\ne |o\rangle$) an eigenvector of the operator $A$. Only the
ray specified by $|\alpha\rangle$ is important. For linear operators, $|\alpha\rangle$ is an eigenvector with
eigenvalue $a$ if and only if $|\alpha\rangle\,c$ is an eigenvector with eigenvalue $a$. Then also
$\langle\alpha|\,A^\dagger = a^*\,\langle\alpha|$ holds, since $\langle\alpha|\,A^\dagger\,|\beta\rangle = \langle\beta|\,A\,|\alpha\rangle^* = \langle\beta|\alpha\rangle^*\,a^* = a^*\,\langle\alpha|\beta\rangle$, for all
$|\beta\rangle$. While the eigenvalues have physical relevance, the state vectors $|\ldots\rangle$ are only
mathematical tools. For discrete eigenvalues $a_n$, the number $n$ is called the quantum
number, e.g., we speak of the oscillation, angular momentum, direction, and principal
quantum numbers in various contexts.

The transformed operator $UAU^{-1}$ has the same eigenvalue $a$ and the eigenvector
$U\,|\alpha\rangle$. From the equation above, it follows in particular that
$UAU^{-1}\,U\,|\alpha\rangle = UA\,|\alpha\rangle = U\,|\alpha\rangle\,a$.

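An important claim is that Hermitian operators have real eigenvalues. In particular,
if $A = A^\dagger$, then the left-hand side of $\langle\alpha|\,A\,|\alpha\rangle = \langle\alpha|\alpha\rangle\,a$ is real. On the right,
the factor $\langle\alpha|\alpha\rangle$ is real, and so therefore is the eigenvalue $a$.

Unitary operators have only eigenvalues of absolute value 1. If in particular
$A^\dagger A = 1$, then we have $\langle\alpha|\alpha\rangle = \langle\alpha|\,A^\dagger A\,|\alpha\rangle = a^*\,\langle\alpha|\alpha\rangle\,a = |a|^2\,\langle\alpha|\alpha\rangle$, so $|a|^2 = 1$.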

If two eigenvectors of a Hermitian operator $A = A^\dagger$ belong to different eigenvalues,
then those eigenvectors are orthogonal to each other. This is because
$A\,|\alpha_n\rangle = |\alpha_n\rangle\,a_n$ implies $0 = \langle\alpha_1|\,A^\dagger - A\,|\alpha_2\rangle = (a_1 - a_2)\,\langle\alpha_1|\alpha_2\rangle$, with $a_1 \ne a_2$, so
$\langle\alpha_1|\alpha_2\rangle$ must vanish, as required. In fact, we have already shown this for the principal
axes of the inertia tensor on p. 88. If all eigenvalues $a_n$ are different, then we may
take the normalized eigenvectors $|\alpha_n\rangle$ as an expansion basis in Hilbert space, since
they then form a complete orthonormal system, and $A$ will then be diagonal:


$\langle\alpha_n|\alpha_{n'}\rangle = \delta_{nn'}$, $\quad 1 = \sum_n |\alpha_n\rangle\langle\alpha_n|$, and $A = \sum_n |\alpha_n\rangle\,a_n\,\langle\alpha_n|$.


For this reason, the determination of the eigenvalues and eigenvectors of an operator 
is referred to as determining the eigen-representation (or diagonalization) of the 
operator—it corresponds to a unitary transformation to a more convenient expansion 
basis for the operator (which gets along without a double sum). For the sum and the 
product of the eigenvalues, no transformation is necessary, since that changes neither 
trace nor determinant. 
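For a Hermitian 2 x 2 matrix the eigen-representation can be worked out in closed form. The sketch below is an illustration with an assumed example matrix and hypothetical helper names; it checks that the eigenvalues are real, that the eigenvectors are orthogonal, that the diagonal representation rebuilds the matrix, and that sum and product of the eigenvalues equal trace and determinant.

```python
import math

def braket(u, v):
    """<u|v> = sum_n u_n^* v_n."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def eigen_hermitian_2x2(m):
    """Eigenvalues and normalized eigenvectors of a Hermitian 2x2 matrix with m[0][1] != 0."""
    a, b, d = m[0][0].real, m[0][1], m[1][1].real
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    pairs = []
    for value in (mean + radius, mean - radius):
        vec = [b, value - a]   # solves (m - value) vec = 0 when b != 0
        norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
        pairs.append((value, [c / norm for c in vec]))
    return pairs

A = [[2, 1 - 1j], [1 + 1j, 3]]   # Hermitian example matrix (assumed for illustration)
(a1, v1), (a2, v2) = eigen_hermitian_2x2(A)

overlap = braket(v1, v2)

# Diagonal representation: A = sum_n |alpha_n> a_n <alpha_n|
rebuilt = [[a1 * v1[i] * v1[j].conjugate() + a2 * v2[i] * v2[j].conjugate()
            for j in range(2)] for i in range(2)]

trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

print(a1, a2)              # real eigenvalues
print(abs(overlap))        # orthogonality of the eigenvectors
print(a1 + a2, trace)      # sum of eigenvalues equals the trace
print(a1 * a2, det)        # product of eigenvalues equals the determinant
```

The closed-form eigenvector `[b, value - a]` only works when the off-diagonal element is non-zero; a diagonal matrix is already in its eigen-representation.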

However, an operator can have several linearly independent eigenvectors corre- 
sponding to the same eigenvalue, e.g., the unit operator has only the eigenvalue 1. We 
then speak of degeneracy: if there are in total N linearly independent eigenvectors 
with the same eigenvalue, it is said to be $N$-fold degenerate. Then $N$ orthonormalized
eigenvectors $|\alpha_n\rangle$ can be chosen as basis vectors with this eigenvalue. This is what
happens in mechanics when we seek the principal moments of inertia.

The eigenvectors $|\alpha_n\rangle$ of an operator $A$ also diagonalize the powers $A^k$ of the
operator $A$ and the operator functions $f(A)$:


$f(A) = \sum_n |\alpha_n\rangle\,f(a_n)\,\langle\alpha_n|$.


The special case $A^{-1} = \sum_n |\alpha_n\rangle\,a_n^{-1}\,\langle\alpha_n|$ shows that none of the eigenvalues can
be zero if the inverse exists. If $a_n = 0$ for some $n$, $A$ is singular.

All functions of the same operator thus have the same eigenvectors, while their
eigenvalue spectra can differ. They also commute with each other. Generally, the
following claim is true: two operators $A$ and $B$ commute with each other if and
only if they share a complete orthonormal system of eigenvectors. If they have
only common eigenvectors, then they commute, because the order of any product
of their eigenvalues is of no importance: $AB = \sum_n |\alpha_n\rangle\,a_n b_n\,\langle\alpha_n|$ is equal to
$BA = \sum_n |\alpha_n\rangle\,b_n a_n\,\langle\alpha_n|$. On the other hand, if initially only $A = \sum_n |\alpha_n\rangle\,a_n\,\langle\alpha_n|$ is
given with $1 = \sum_n |\alpha_n\rangle\langle\alpha_n|$, then we have


$AB = \sum_{nn'} |\alpha_n\rangle\,a_n\,\langle\alpha_n|\,B\,|\alpha_{n'}\rangle\langle\alpha_{n'}|$,

$BA = \sum_{nn'} |\alpha_n\rangle\langle\alpha_n|\,B\,|\alpha_{n'}\rangle\,a_{n'}\,\langle\alpha_{n'}|$.




From $AB - BA = 0$, we deduce that $\langle\alpha_n|\,B\,|\alpha_{n'}\rangle\,(a_n - a_{n'}) = 0$, because the zero
operator in each basis has only zeros as matrix elements. If there is no degeneracy,
then $a_n \ne a_{n'}$ holds for all $n \ne n'$ and hence $\langle\alpha_n|\,B\,|\alpha_{n'}\rangle$ is diagonal. Then each $|\alpha_n\rangle$
is also an eigenvector of $B$. But if eigenvalues $a_n$ are degenerate, then one can make
use of the freedom in the choice of the basis vectors to diagonalize the matrix $B$.
When $[A, B] = 0$, there is thus always a complete system of eigenvectors for both
operators.

If an operator has degenerate eigenvalues, we must search for further operators
which commute with it and lift the degeneracy. Then we can denote the eigenvectors
by the associated eigenvalues, e.g., $|\alpha_n\rangle = |a_n, b_n, \ldots\rangle$. Here we may leave out the
index $n$ on the right-hand side and write for short $|a, b, \ldots\rangle$. Each eigenvector differs
from the others by the order of the values. If there is no degeneracy for $A$, then the
notation $|a\rangle$ suffices. Hence we write in the following


$A\,|a\rangle = |a\rangle\,a$, with $\langle a|a'\rangle = \delta_{aa'}$ and $\sum_a |a\rangle\langle a| = 1$.


Here $a$ is assumed to be discrete. But the operator may also have a continuous
eigenvalue spectrum, or even some discrete and some continuous eigenvalues, as
happens for the Hamilton operator of the hydrogen atom. For continuous eigenvalues,
we have $A = \int |a\rangle\,a\,\langle a|\,\mathrm{d}a$ with $\langle a|a'\rangle = \delta(a - a')$ and $\int |a\rangle\langle a|\,\mathrm{d}a = 1$. Then sums
have to be replaced by integrals and Kronecker symbols by delta functions.

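If a Hermitian operator depends on a parameter $\lambda$, e.g., $A(\lambda) = \sum_n c_n(\lambda)\,X^n$,
then so do its eigenstates and eigenvalues. For the eigenvalues, we then have the
Hellmann–Feynman theorem:


$A\,|a\rangle = |a\rangle\,a \implies \langle a|\,\frac{\partial A}{\partial\lambda}\,|a\rangle = \frac{\partial a}{\partial\lambda}$.

For the proof we differentiate $\langle a|\,A - a\,|a\rangle = 0$ with respect to $\lambda$ and make use of $A$
being Hermitian:


$\langle a|\,\frac{\partial A}{\partial\lambda} - \frac{\partial a}{\partial\lambda}\,|a\rangle + 2\,\mathrm{Re}\,\langle\frac{\partial a}{\partial\lambda}|\,A - a\,|a\rangle = 0$.


Here $\langle\frac{\partial a}{\partial\lambda}|$ denotes the derivative of the bra-vector $\langle a|$ with respect to $\lambda$, and $|a\rangle$ is
taken to be normalized to 1. This suffices for the proof because $(A - a)\,|a\rangle = 0$. The theorem
is mainly applied to the Hamilton operator, but we may use it also for other observables.
This is connected to the adiabatic theorem. If the Hamilton operator $H(t)$ varies sufficiently slowly
with time, then a system initially in an eigenstate of $H(t_0)$ remains in the eigenstate
developing from it, provided that it always remains non-degenerate. This will be
demonstrated in Fig. 4.11 on p. 348 for the time-dependent oscillator.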
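The Hellmann–Feynman theorem is easy to test on a two-level example. The sketch below assumes, only for illustration, the operator A(lam) with rows (1, lam) and (lam, -1), whose upper eigenvalue is sqrt(1 + lam^2); it compares the expectation value of dA/dlam in the eigenstate with a finite-difference derivative of the eigenvalue.

```python
import math

def eigenvalue(lam):
    """Upper eigenvalue a(lam) of A(lam) = [[1, lam], [lam, -1]] (assumed example operator)."""
    return math.sqrt(1.0 + lam * lam)

def eigenvector(lam):
    """Normalized real eigenvector for the upper eigenvalue (valid for lam != 0)."""
    a = eigenvalue(lam)
    vec = [lam, a - 1.0]
    norm = math.sqrt(vec[0] ** 2 + vec[1] ** 2)
    return [c / norm for c in vec]

lam = 0.75
v = eigenvector(lam)

# <a| dA/dlam |a> with dA/dlam = [[0, 1], [1, 0]] and a real eigenvector
expectation = 2.0 * v[0] * v[1]

# da/dlam from a central finite difference
h = 1e-6
derivative = (eigenvalue(lam + h) - eigenvalue(lam - h)) / (2 * h)

print(expectation, derivative)   # both should be close to lam / sqrt(1 + lam^2)
```

Note that no derivative of the eigenvector is needed on the left-hand side, which is the practical content of the theorem.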


Before these two authors, the Hellmann–Feynman theorem had already been formulated by Güttinger [3] in his diploma thesis.




4.2.5 Expansion in Terms of a Basis of Orthogonal Operators 


Two operators $A$ and $B$ are said to be orthogonal to each other if the trace of $A^\dagger B$
vanishes. For matrices of finite dimension, the order of the factors is not important
and nor is it important which factor is chosen as adjoint, since $\mathrm{tr}(A^\dagger B)$ is then equal
to $\mathrm{tr}(BA^\dagger) = \{\mathrm{tr}[(A^\dagger B)^\dagger]\}^*$ and $(A^\dagger B)^\dagger = B^\dagger A$. In particular, $\mathrm{tr}(A^\dagger A)$ is real, and it is
also non-negative, since $\mathrm{tr}(A^\dagger A) = \sum_{nn'} |\langle n|\,A\,|n'\rangle|^2$.

We can introduce an orthogonal system of operators $C_n$ as a common expansion
basis for all operators. If we take Hermitian operators ($C_n^\dagger = C_n$), then that simplifies
the considerations even further, but we shall not do so yet. Thus we only require, for
all $n$, $n'$,


$\mathrm{tr}(C_n^\dagger C_{n'}) = c\,\delta_{nn'}$.


Here, $c = c^* > 0$ is a normalization factor, which we can choose at our convenience.
$c = 2$ is often chosen, e.g., for the Pauli matrices in Sect. 4.2.10 and their generalizations
to more than two dimensions, the Gell-Mann matrices. But occasionally, $c = 1$
is also chosen. In fact, it can also depend on $n = n'$, but we shall not pursue this any
further here.

If the basis $\{C_n\}$ is complete, then for arbitrary operators $A$, we have


$A = \sum_n C_n\,\frac{\mathrm{tr}(C_n^\dagger A)}{c}$, with the expansion coefficients $\mathrm{tr}(C_n^\dagger A)/c$.


For a Hermitian basis $\{C_n^\dagger = C_n\}$, all Hermitian operators have real expansion coefficients.
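An expansion in an orthogonal operator basis can be tried out in two dimensions. The sketch below assumes the unit matrix together with the three Pauli matrices as the basis (normalization c = 2) and an arbitrary Hermitian example matrix; the helper names are illustrative.

```python
# Assumed operator basis: unit matrix and Pauli matrices, with tr(C_n^dagger C_n') = 2 delta_nn'
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
BASIS = [I2, SX, SY, SZ]
C_NORM = 2

def dagger(m):
    """Hermitian adjoint: transpose and complex conjugate."""
    return [[complex(m[j][i]).conjugate() for j in range(len(m))] for i in range(len(m))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

A = [[2, 1 - 1j], [1 + 1j, 3]]   # Hermitian example matrix (assumed for illustration)

# Expansion coefficients tr(C_n^dagger A) / c
coeffs = [trace(matmul(dagger(c), A)) / C_NORM for c in BASIS]

# Rebuild A = sum_n C_n tr(C_n^dagger A) / c
rebuilt = [[sum(coeffs[n] * BASIS[n][i][j] for n in range(4)) for j in range(2)]
           for i in range(2)]

print(coeffs)    # real numbers, because A and all basis operators are Hermitian
print(rebuilt)
```

The four coefficients are real, in line with the statement that Hermitian operators have real expansion coefficients in a Hermitian basis.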

In an $N$-dimensional vector space, we need $N^2$ basis operators, one for each matrix
element. But they would all also commute with each other. This is no longer true for
our general basis. Nevertheless, their commutators can be expanded:


$\frac{1}{\mathrm{i}}\,[C_{n'}, C_{n''}] = \sum_n C_n\,\frac{\mathrm{tr}(C_n^\dagger\,[C_{n'}, C_{n''}])}{\mathrm{i}\,c}$.

If the basis consists of Hermitian operators, then on the right there are only real expansion
coefficients, the so-called structure constants of the associated Lie algebra (see
[8]), which are antisymmetric in the three indices: symmetric for cyclic permutations
since $\mathrm{tr}(C_n [C_{n'}, C_{n''}]) = \mathrm{tr}(C_{n'} [C_{n''}, C_n])$, and antisymmetric for anti-cyclic permutations
since $\mathrm{tr}(C_n [C_{n'}, C_{n''}]) = -\mathrm{tr}(C_{n'} [C_n, C_{n''}])$. Unitary transformations do not
change that.
It is advantageous to start the basis with the unit operator, $C_0 \propto 1$, because this
operator commutes with all other operators and only its trace is non-zero, since the
other operators should be orthogonal to it: $\mathrm{tr}\,C_n \propto \delta_{n0}$.




A first example will be presented in Sect.4.2.10. In a two-dimensional vector 
space, the Pauli operators are useful as an expansion basis. With the Wigner function 
(Sect. 4.3.5), we also employ an operator basis. 


4.2.6 Observables. Basic Assumptions 


In the above, we have provided the mathematical tools, and now we turn to physics 
again. We start with basics. So far we have assumed only that (pure) states can be 
represented by proper or improper Hilbert vectors and that the scalar product $\langle\psi|\varphi\rangle$
yields the probability amplitude for the state $|\psi\rangle$ to be contained in the state $|\varphi\rangle$.
Now we add the following: 


To every measurable quantity (an observable, e.g., position, momentum, energy, 
angular momentum) is assigned a Hermitian operator. Its real eigenvalues are 
equal to all possible measurable results of this observable. 


If the statistical ensemble is in an eigenstate, then the associated eigenvalue is always 
measured: the measured result is sharp. And conversely, if the same value is always 
measured, then it is in this state. In contrast, if the ensemble is not in an eigenstate 
of the measurable operator, then the measured results scatter about the average value 
with a non-zero uncertainty. 

For dynamical variables, we may take only Hermitian operators, because only 
they have real eigenvalues; and measured results are real quantities. For a complex 
quantity, we would have to measure two numbers. As possible measured results for 
the dynamical variable A, only the eigenvalues {a} of the assigned operator A occur. 
This is the physical meaning of the eigenvalues. Furthermore, the orthogonality of 
two eigenstates can be interpreted physically: the two states always deliver different 
experimental values. But note that, for degenerate states, we have to consider further 
properties. 

If the system ensemble is in the state $|a\rangle$, the measured results for all variables
$f(A)$ are fixed, namely, $f(a)$. In contrast, for all other quantities $B$ with $[A, B] \ne 0$
in the statistical ensemble, generally different values $b$ will be measured. If $B$ does
not commute with $A$, then in most cases $|a\rangle$ cannot be represented by a single eigenvector
of $B$. Then the state $|a\rangle = \sum_b |b\rangle\langle b|a\rangle$ can only be decomposed into several
eigenstates $|b\rangle$ of $B$. This is the physical relevance of the superposition principle.

If the system is prepared in the state $|\psi\rangle$, then generally different values for the
variable $A$ are measured, except for the case where $|\psi\rangle$ is an eigenstate of $A$. We
consider now the average value, and in the next section the uncertainty. For the
average value, we have to weight the possible measured results $a$ with the associated
probabilities $|\langle a|\psi\rangle|^2$. Since we measure the value $a$ with the probability $|\langle a|\psi\rangle|^2$
and the value $a'$ with the probability $|\langle a'|\psi\rangle|^2$,


4.2 Operators and Observables 299 
A=) aly)? a= > (hla) a (alp) = (WIA lY) = (A). 


The average value $\overline{A}$ is also called the expectation value $\langle A\rangle$. The matrix
element $\langle\psi|A|\psi\rangle$ delivers the expectation value for the observable $A$ in the
state $|\psi\rangle$. The expectation value is determined from the set of possible experimental
values. For discrete eigenvalues, it may definitely differ from all possible experimental values,
e.g., it may lie between the $n$th and $(n+1)$th level.

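In the real-space representation $\{|\mathbf r\rangle\}$, we have correspondingly

\[ \langle\psi|A|\psi\rangle = \int \mathrm d^3r\, \mathrm d^3r'\; \langle\psi|\mathbf r\rangle\, \langle\mathbf r|A|\mathbf r'\rangle\, \langle\mathbf r'|\psi\rangle , \]

with $\langle\mathbf r'|\psi\rangle = \psi(\mathbf r')$ and $\langle\psi|\mathbf r\rangle = \psi^*(\mathbf r)$, according to Sect. 4.1.6. In most cases,
we have to deal with local operators. These are diagonal in the real-space representation,
so the double integral becomes a single one. For local operators, we thus have

\[ \langle\psi|A|\psi\rangle = \int \mathrm d^3r\; \psi^*(\mathbf r)\, A(\mathbf r)\, \psi(\mathbf r) = \int \mathrm d^3r\; |\psi(\mathbf r)|^2\, A(\mathbf r) . \]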

The general matrix element $\langle\psi|A|\varphi\rangle$ with $\psi \neq \varphi$ cannot be interpreted classically
for three reasons: it depends on two states, it is a complex number, and it involves
(like $\langle\psi|$ and $|\varphi\rangle$) an arbitrary phase factor. In quantum theory, we deal with the
transition amplitude from $|\varphi\rangle$ with $A$ to $\langle\psi|$. Note that it is important to get used to
reading the expressions in quantum theory from right to left: the operator $A$ acts on
the ket-vector and only then is the probability amplitude of this new ket-vector with
the bra-vector of importance. These difficulties with the classical meaning do not
occur for the diagonal elements $\langle\psi|A|\psi\rangle$: because $A$ is Hermitian, it is real, and if
$|\psi\rangle$ is multiplied by $\exp(\mathrm i\phi)$, then likewise $\langle\psi|$ is multiplied by $\exp(-\mathrm i\phi)$.


4.2.7 Uncertainty 


If the system is in an eigenstate of the considered measurable quantity $A$, then the
measured result is known sharply (with certainty) and $\Delta A = 0$. Otherwise, different
experimental values occur with their corresponding probabilities. Nevertheless, the
average value $\overline{A}$ of the experimental values is equal to the expectation value
$\langle\psi|A|\psi\rangle$, and also $\overline{A^2} = \langle\psi|A^2|\psi\rangle$ is known. Hence, according to p. 275,
the uncertainty is also known:

\[ \Delta A = \sqrt{\langle\psi|A^2|\psi\rangle - \langle\psi|A|\psi\rangle^2} . \]


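It only vanishes if $|\psi\rangle$ is an eigenstate of $A$. Otherwise we have $\overline{A^2} > \overline{A}^2$. In particular,
if we take a basis with $|\psi\rangle$ as the first vector and denote the remaining basis vectors
by $|n\rangle$, then for Hermitian operators $A$, we have

\[ \langle\psi|A^2|\psi\rangle = \langle\psi|A|\psi\rangle\langle\psi|A|\psi\rangle + \sum_n \langle\psi|A|n\rangle\langle n|A|\psi\rangle
   = \langle\psi|A|\psi\rangle^2 + \sum_n |\langle n|A|\psi\rangle|^2 . \]

If $|\psi\rangle$ is not an eigenstate of $A$, the first term is not the only one to contribute. We
then have $\Delta A > 0$.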
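As a small numerical illustration of the expectation value and the uncertainty, the following sketch evaluates $\langle\psi|A|\psi\rangle$ and $\Delta A$ for a two-level observable with eigenvalues $\pm 1$. The matrix `A`, the two sample states, and the helper names `expectation`, `matmul`, and `uncertainty` are assumptions chosen for this example only.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expectation(op, psi):
    """Return <psi|op|psi> for a normalized state vector psi."""
    n = len(psi)
    return sum(psi[i].conjugate() * op[i][j] * psi[j] for i in range(n) for j in range(n))

def uncertainty(op, psi):
    """Return sqrt(<op^2> - <op>^2); tiny negative rounding errors are clipped to zero."""
    mean = expectation(op, psi).real
    mean_sq = expectation(matmul(op, op), psi).real
    return math.sqrt(max(mean_sq - mean * mean, 0.0))

# Hypothetical observable with eigenvalues +1 and -1 (diagonal in the chosen basis).
A = [[1, 0], [0, -1]]

eigenstate = [1 + 0j, 0j]                                  # eigenvector with eigenvalue +1
superposition = [math.sqrt(0.5) + 0j, math.sqrt(0.5) + 0j]  # equal-weight superposition

print(expectation(A, eigenstate).real, uncertainty(A, eigenstate))
print(expectation(A, superposition).real, uncertainty(A, superposition))
```

For the eigenstate the uncertainty vanishes, while for the equal-weight superposition the expectation value 0 lies between the two possible measured values and the uncertainty equals 1.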
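For the uncertainty relation, we consider two Hermitian operators $A$ and $B$. Then
with $|\alpha\rangle = (A - \overline{A})\,|\psi\rangle$ and $|\beta\rangle = (B - \overline{B})\,|\psi\rangle$, and because
$(\Delta A)^2 = \langle\psi|(A - \overline{A})^2|\psi\rangle = \|\alpha\|^2$ and $(\Delta B)^2 = \|\beta\|^2$, we obtain for $\|\psi\| = 1$,

\[ (\Delta A)^2 \cdot (\Delta B)^2 = \|\alpha\|^2\,\|\beta\|^2 \ge |\langle\alpha|\beta\rangle|^2 = |\langle\psi|(A - \overline{A})(B - \overline{B})|\psi\rangle|^2 , \]

where we have used Schwarz's inequality (see p. 283). For Hermitian operators
$C$ and $D$, according to p. 293, we now have
$|\overline{CD}|^2 = |\tfrac12\overline{\{C, D\}}|^2 + |\tfrac12\overline{[C, D]}|^2$. With
$\overline{\{A - \overline{A}, B - \overline{B}\}} = \overline{\{A, B\}} - 2\,\overline{A}\,\overline{B}$ and
$[A - \overline{A}, B - \overline{B}] = [A, B]$, it thus follows that

\[ (\Delta A)^2 \cdot (\Delta B)^2 \ge \langle\psi|\tfrac12\{A, B\} - \overline{A}\,\overline{B}\,|\psi\rangle^2 + |\langle\psi|\tfrac12[A, B]|\psi\rangle|^2 . \]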


If the operators A and B do not commute with each other, but if their commutator 
is equal to the unit operator up to an imaginary constant, the last term contributes 
positively and the two quantities A and B cannot both be sharp. 
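The inequality can be checked numerically. The sketch below evaluates both sides of $(\Delta A)^2(\Delta B)^2 \ge (\tfrac12\overline{\{A,B\}} - \overline{A}\,\overline{B})^2 + |\tfrac12\overline{[A,B]}|^2$ for one example. The two $3\times3$ Hermitian matrices, the state `psi`, and the function name `schroedinger_bound` are assumptions made purely for illustration.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expectation(op, psi):
    """Return <psi|op|psi> for a normalized state vector psi."""
    n = len(psi)
    return sum(psi[i].conjugate() * op[i][j] * psi[j] for i in range(n) for j in range(n))

def schroedinger_bound(A, B, psi):
    """Return (left-hand side, right-hand side) of the uncertainty inequality."""
    a = expectation(A, psi).real
    b = expectation(B, psi).real
    var_a = expectation(matmul(A, A), psi).real - a * a
    var_b = expectation(matmul(B, B), psi).real - b * b
    ab = expectation(matmul(A, B), psi)
    ba = expectation(matmul(B, A), psi)
    anti_term = ((ab + ba) / 2).real - a * b   # real, from the anti-commutator
    comm_term = abs((ab - ba) / 2)             # modulus of the (imaginary) commutator part
    return var_a * var_b, anti_term ** 2 + comm_term ** 2

# Hypothetical Hermitian observables on a three-dimensional space.
A = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]
B = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

norm = math.sqrt(1 + 4 + 4)
psi = [1 / norm, 2j / norm, 2 / norm]          # normalized sample state

lhs, rhs = schroedinger_bound(A, B, psi)
print(lhs, rhs, lhs >= rhs)
```

For this particular state the left-hand side is $68/81$ and the right-hand side $4/81$, so the inequality is satisfied with a wide margin.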

Heisenberg's uncertainty relation for canonically conjugate quantities $A$ and $B$,
viz.,

\[ \Delta A \cdot \Delta B \ge \tfrac12\hbar , \]

can thus be guaranteed with non-commuting operators. They only have to obey the
requirement

\[ [A, B] = \mathrm i\hbar\, 1 . \]


According to Born and Jordan, we can require this of all canonically conjugate 
quantities and this is the very reason why we actually deal with operators and Hilbert 
vectors. In the next section, we shall point out connections with these commutation 
relations. 

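There are two conditions under which the product of the uncertainties $\Delta A \cdot \Delta B$
is as small as possible. Firstly, we must have $\tfrac12\overline{AB + BA} = \overline{A}\,\overline{B}$, or
$\overline{AB} - \tfrac12\overline{[A, B]} = \overline{A}\,\overline{B}$, so that
$\overline{(A - \overline{A})\,B} = \tfrac12\overline{[A, B]}$. Secondly, according to p. 283, only if the
considered vectors $|\alpha\rangle$ and $|\beta\rangle$ are parallel to each other does Schwarz's inequality become
an equation, i.e., if

\[ (A - \overline{A})\,|\psi\rangle = \lambda\,(B - \overline{B})\,|\psi\rangle . \]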




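But then also $\tfrac12\overline{[A, B]} = \langle\psi|(A - \overline{A})\,B|\psi\rangle = \lambda^*\,\langle\psi|(B - \overline{B})\,B|\psi\rangle = \lambda^*\,(\Delta B)^2$. Here,
according to the initial equation in the considered extreme case, we have
$\Delta A\,\Delta B = \pm\frac{1}{2\mathrm i}\,\overline{[A, B]}$, where the left-hand expression is $\ge 0$ and the one on the
right fixes the sign. In short then, $\Delta A\,\Delta B = \mp\,\mathrm i\lambda^*(\Delta B)^2$, or

\[ \lambda = \mp\,\mathrm i\,\frac{\Delta A}{\Delta B} . \]

For canonically conjugate quantities $A$ and $B$ with $[A, B] = \mathrm i\hbar\,1$, we have to take the
upper sign, and for $\mathrm i[A, B] > 0$, the lower sign.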
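A worked example may make the condition $(A - \overline{A})|\psi\rangle = \lambda (B - \overline{B})|\psi\rangle$ with $\lambda = \mp\mathrm i\,\Delta A/\Delta B$ more concrete. The sketch assumes the real-space representation $X \to x$, $P \to \frac{\hbar}{\mathrm i}\frac{\mathrm d}{\mathrm dx}$, which is only treated later and is taken as given here; the width parameter $\sigma$ is a symbol introduced for this illustration. A Gaussian centred at the origin then satisfies the condition with the upper sign:

```latex
\begin{aligned}
\psi(x) &\propto \exp\!\Big(-\frac{x^2}{4\sigma^2}\Big), \qquad \overline{X} = 0 = \overline{P}, \\
P\,\psi &= \frac{\hbar}{\mathrm i}\,\frac{\mathrm d\psi}{\mathrm dx}
         = \frac{\mathrm i\hbar}{2\sigma^2}\,x\,\psi
 \quad\Longrightarrow\quad
 X\,\psi = -\mathrm i\,\frac{2\sigma^2}{\hbar}\,P\,\psi , \\
\Delta X &= \sigma, \quad \Delta P = \frac{\hbar}{2\sigma}
 \quad\Longrightarrow\quad
 \lambda = -\mathrm i\,\frac{\Delta X}{\Delta P}, \qquad
 \Delta X \cdot \Delta P = \tfrac12\hbar .
\end{aligned}
```

Under these assumptions the Gaussian reaches the lower bound of Heisenberg's relation.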


4.2.8 Field Operators 


Once again, we turn to the wave-particle duality and here restrict ourselves to (many) 
“quanta in the same state”, e.g., with equal momentum. So the following considera- 
tions apply only to bosons, but not fermions, e.g., not electrons, because according 
to the Pauli principle only one fermion may occupy a given state. The discussion 
here will be useful later for the harmonic oscillator (Sect. 4.5.4), where the transition 
to neighboring states is always connected with an oscillation quantum of the same 
energy. Note that sound quanta are also called phonons, and light quanta photons. 

The Dirac symbol $|n\rangle$ will now be used to indicate that there are $n$ particles. The
numbers $n \in \{0, 1, 2, \ldots\}$ are the eigenvalues of the number operator $N$ and $|n\rangle$ its
eigenstates, which we shall investigate now in some detail. To this end, we introduce
(non-Hermitian) creation and annihilation operators:

\[ \Psi^\dagger\,|n\rangle \propto |n+1\rangle \quad\Longleftrightarrow\quad \Psi\,|n\rangle \propto |n-1\rangle . \]


Note that, in many textbooks on quantum mechanics, and also according to the
IUPAP recommendations, $a$ or $b$ is used instead of $\Psi$, which is common practice in
field theory though, and indeed this operator has something to do with the state $|\psi\rangle$.
$\Psi\,|\psi\rangle$ results in particular in the vacuum, as we shall soon show. Instead of the state
$|\psi\rangle$, we may also speak of the field $|\psi\rangle$, if we think of its real-space representation
$\langle\mathbf r|\psi\rangle = \psi(\mathbf r)$. Since negative eigenvalues $n$ may not occur, $\Psi\,|0\rangle$ has to deliver the
zero vector $|o\rangle$. Note, however, that $|0\rangle$ is not the zero vector $|o\rangle$, but the state with
$n = 0$. If $n$ gives the number of "particles", then $|0\rangle$ is the state without particles, the
"vacuum", for which $\langle 0|0\rangle = 1$, in contrast to $\langle o|o\rangle = 0$.
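Both $\Psi\Psi^\dagger$ and $\Psi^\dagger\Psi$ therefore have the eigenvectors $|n\rangle$. We now require

\[ N = \Psi^\dagger\Psi . \]

Hence, due to the normalization, it follows from

\[ n = \langle n|N|n\rangle = \langle n|\Psi^\dagger\Psi|n\rangle \propto \langle n-1|n-1\rangle , \quad\text{for } n > 0 , \]

that

\[ \Psi\,|n\rangle = |n-1\rangle\,\sqrt{n} \quad\Longleftrightarrow\quad \Psi^\dagger\,|n\rangle = |n+1\rangle\,\sqrt{n+1} , \]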


if we choose the phase factor (arbitrarily) equal to 1. The operator $\Psi$ thus reduces the
particle number by one, and is therefore called the annihilation operator, while the
adjoint operator $\Psi^\dagger$ increases it by one and is therefore called the creation operator.
This leads to

\[ |n\rangle = \frac{(\Psi^\dagger)^n}{\sqrt{n!}}\,|0\rangle , \]

i.e., all states can be created with this from the "vacuum state" $|0\rangle$. It is special
insofar as the annihilation operator $\Psi$ maps only this to the zero vector $|o\rangle$. We
have $\Psi^\dagger\Psi\,|0\rangle = |o\rangle$, but $\Psi\Psi^\dagger\,|0\rangle = |0\rangle$, and generally, $\Psi^\dagger\Psi\,|n\rangle = |n\rangle\,n$ as well as
$\Psi\Psi^\dagger\,|n\rangle = |n\rangle\,(n+1)$, for all natural numbers $n$. Hence, we arrive at the basic
commutation relation

\[ [\Psi, \Psi^\dagger] \equiv \Psi\Psi^\dagger - \Psi^\dagger\Psi = 1 . \]

Thus $\Psi\Psi^\dagger = 1 + N$ holds, and we obtain from $\Psi^\dagger\Psi\Psi^\dagger = \Psi^\dagger(1 + N)$, or the adjoint
$\Psi\Psi^\dagger\Psi = (1 + N)\,\Psi$,

\[ [N, \Psi^\dagger] = \Psi^\dagger , \qquad [N, \Psi] = -\Psi . \]


Conversely, from $[\Psi, \Psi^\dagger] = 1$, we can derive the real eigenvalue spectrum of
$\Psi^\dagger\Psi$ and the matrix elements of $\Psi$ and $\Psi^\dagger$ in the eigenbasis of this Hermitian
operator, for an appropriate phase convention. In particular, from
$\Psi^\dagger\Psi\Psi = (\Psi\Psi^\dagger - [\Psi, \Psi^\dagger])\,\Psi = \Psi\,(\Psi^\dagger\Psi - 1)$, we conclude that the operator $\Psi$ creates more
eigenvectors of $\Psi^\dagger\Psi$ from eigenvectors of $\Psi^\dagger\Psi$, but with an eigenvalue that is
reduced by one. On the other hand, this decrease finally has to lead to the zero vector,
and therefore to an end, since $\langle\ldots|\Psi^\dagger\Psi|\ldots\rangle$, being the square of the norm of the
Hilbert vector $\Psi\,|\ldots\rangle$, may not become negative and yet is still equal to one of the
possible eigenvalues of $\Psi^\dagger\Psi$. Hence $\Psi^\dagger\Psi$ has the natural numbers as eigenvalues, and
we choose the phases such that $\Psi^\dagger\,|n\rangle = |n+1\rangle\sqrt{n+1} \Longleftrightarrow \Psi\,|n\rangle = |n-1\rangle\sqrt{n}$
holds.
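Using the field operators, we can expand the projection operator $|0\rangle\langle 0|$:

\[ |0\rangle\langle 0| = \sum_n \frac{(-)^n}{n!}\,\Psi^{\dagger n}\,\Psi^n , \]

since, for all natural numbers $m$, $\langle m|0\rangle\langle 0|m\rangle = \delta_{m0} = (1 - 1)^m = \sum_n (-)^n \binom{m}{n}$ and
$\langle m|\Psi^{\dagger n}\Psi^n|m\rangle = n!\,\binom{m}{n}$ and the operator is diagonal in the basis $\{|m\rangle\}$.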
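The counting argument behind this expansion can be verified with exact integer arithmetic. The sketch below sums $\frac{(-)^n}{n!}\,\langle m|\Psi^{\dagger n}\Psi^n|m\rangle$ with $\langle m|\Psi^{\dagger n}\Psi^n|m\rangle = n!\binom{m}{n}$ and should return 1 only for $m = 0$. The function name `vacuum_projector_diagonal` is an assumption made for this example.

```python
from math import comb, factorial

def vacuum_projector_diagonal(m):
    """Diagonal element <m| sum_n (-1)^n/n! (Psi^dagger)^n Psi^n |m>, computed exactly."""
    total = 0
    for n in range(m + 1):                       # terms with n > m vanish because Psi^n |m> = 0
        element = factorial(n) * comb(m, n)      # <m| (Psi^dagger)^n Psi^n |m> = m!/(m-n)!
        total += (-1) ** n * (element // factorial(n))
    return total

print([vacuum_projector_diagonal(m) for m in range(6)])
```

The list starts with 1 for the vacuum and contains only zeros afterwards, as expected for the projector onto $|0\rangle$.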




Even though the operators $\Psi\Psi^\dagger$ and $\Psi^\dagger\Psi$ have discrete eigenvalues, there are
infinitely many of them. Hence the associated basis is of infinite dimension and the
traces of both operators diverge. Only then can $\mathrm{tr}(\Psi\Psi^\dagger) = \mathrm{tr}(\Psi^\dagger\Psi)$ hold on the one
hand and $\Psi\Psi^\dagger - \Psi^\dagger\Psi = 1$ on the other, whence $\mathrm{tr}[\Psi, \Psi^\dagger] \neq 0$ is valid.

Even if we reject very large eigenvalues $n \gg 1$ as unphysical, a finite basis does
of course exist. To investigate this possibility more closely (we shall use it in the next
section, but only there), we introduce an upper limit $s$ and require $n \in \{0, \ldots, s\}$.
Here, $s$ is then assumed large, so that the finite basis comprises all physically necessary
states. However, we can then no longer require $[\Psi, \Psi^\dagger] = 1$, since for a finite
basis, the trace of the commutators must vanish. But according to Pegg and Barnett
[4], there are operators (we shall call them $\tilde\Psi$), which act on physical states like $\Psi$
and nevertheless need only a finite basis:

\[ \tilde\Psi = \sum_{n=1}^{s} |n-1\rangle\,\sqrt{n}\,\langle n| \quad\Longleftrightarrow\quad \tilde\Psi^\dagger = \sum_{n=1}^{s} |n\rangle\,\sqrt{n}\,\langle n-1| . \]

With the finite sum and with $1 = \sum_{n=0}^{s} |n\rangle\langle n|$, we now obtain

\[ [\tilde\Psi, \tilde\Psi^\dagger] = 1 - |s\rangle\,(s+1)\,\langle s| . \]

The new term ensures that $\mathrm{tr}[\tilde\Psi, \tilde\Psi^\dagger] = 0$, as is appropriate for a finite basis.
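The truncated commutator can be made explicit with small matrices. The following sketch builds the annihilation matrix on the basis $|0\rangle, \ldots, |s\rangle$ and evaluates its commutator with the adjoint; the cutoff `s = 5` and the helper names `ladder`, `transpose`, `matmul`, and `commutator` are assumptions chosen for illustration.

```python
import math

def ladder(s):
    """Truncated annihilation matrix: element (n-1, n) equals sqrt(n) for n = 1..s."""
    dim = s + 1
    op = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        op[n - 1][n] = math.sqrt(n)
    return op

def transpose(m):
    """Transpose; for a real matrix this is also the adjoint."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

s = 5                                   # hypothetical (small) cutoff
psi = ladder(s)
psi_dag = transpose(psi)
comm = commutator(psi, psi_dag)

diagonal = [round(comm[n][n], 12) for n in range(s + 1)]
trace = sum(comm[n][n] for n in range(s + 1))
print(diagonal, round(trace, 12))
```

The diagonal consists of ones except for the last entry, which equals $-s$, i.e., $1 - (s+1)$, so the trace vanishes.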
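Before we make any further use of the field operators for bosons, let us make here
a brief mention of the field operators for fermions, even though we shall treat these
in more detail in Sect. 4.2.10. We use once again $N = \Psi^\dagger\Psi$, but $N$ will only have the
eigenvalues 0 and 1, as required by the Pauli principle, and $\Psi^2$ and hence also $(\Psi^\dagger)^2$
will always be zero. We write the two states as column vectors, with $|0\rangle$ as
$\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and $|1\rangle$ as $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, for the number to increase upwards (conversely, for
bosons, the state $|n\rangle$ is a column vector with just zeros and a 1 in the $n$th row). Then all
these requirements can be satisfied with

\[ N \mathrel{\hat=} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} , \qquad \Psi \mathrel{\hat=} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} , \]

and consequently

\[ \Psi^\dagger \mathrel{\hat=} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} , \qquad \Psi\Psi^\dagger \mathrel{\hat=} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} . \]

For fermions, it is thus the anti-commutator of $\Psi$ and $\Psi^\dagger$ which is equal to 1:

\[ \Psi\Psi^\dagger + \Psi^\dagger\Psi = 1 , \]

and $\Psi^\dagger\,|0\rangle = |1\rangle$, $\Psi^\dagger\,|1\rangle = |o\rangle = \Psi\,|0\rangle$, $\Psi\,|1\rangle = |0\rangle$. We often find 0 written here
instead of $|o\rangle$, even though $A\,|\psi\rangle$ is a Hilbert vector.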




4.2.9 Phase Operators and Wave-Particle Dualism


The natural numbers as eigenvalues fit into the particle picture. But because of the
necessary interference, we need an uncertain particle number for wave-particle duality,
thus a superposition of different states $|n\rangle$. Then the initial equations $[\Psi, \Psi^\dagger] = 1$
and $N = \Psi^\dagger\Psi$ are still valid further on, but now the phase factors are also important
in the wave picture for the superposition of different states $|n\rangle$.

The appropriate determination of the phase operators was long a subject of 
research. Dirac was himself occupied with this in 1927. Only Pegg and Barnett 
(see the last section) succeeded in solving the problem: the basis must not be infinite, 
but only of unmeasurably high dimension. Let us discuss this now, but simply set 
the phase of the vacuum equal to zero, not leaving it open. 

The phases $\phi$ are unique only between 0 and $2\pi$. In order not to introduce a
continuous basis (with improper Hilbert vectors), we set

\[ \phi_m = \frac{2\pi}{s+1}\,m , \quad\text{with } m \in \{0, \ldots, s\} . \]

Here (as in the last section), we take a very large limit $s$, but nevertheless finite, since
the phase (like any continuous quantity) cannot be measured with arbitrary accuracy.
We also introduce a Hermitian operator $\Phi$ with eigenvalues $\phi_m$, such that
$\Phi\,|\phi_m\rangle = |\phi_m\rangle\,\phi_m$. It is important to show that the states $m = 0$ and $m = s$ are neighboring
states. Hence, initially, we search for the unitary operator $E = \exp(\mathrm i\Phi)$ with the
property $E\,|\phi_m\rangle = |\phi_m\rangle \exp(\mathrm i\phi_m)$ for $m \in \{0, \ldots, s\}$ (see Fig. 4.3).

The basis $\{|\phi_m\rangle\}$ is assumed orthonormal and complete. Then we have

\[ E = \sum_{m=0}^{s} |\phi_m\rangle \exp(\mathrm i\phi_m)\,\langle\phi_m| \quad\text{with}\quad \langle\phi_m|\phi_{m'}\rangle = \delta_{mm'} . \]

It should be stressed here that $s$ is assumed very large, even though we leave out
$\lim_{s\to\infty}$ in front of the sum. In contrast to the last section, however, we may anticipate
that all states $|\phi_m\rangle$ can be important physically, those with $m \approx s$ as much as those
with $m \approx 0$, while we shall not take the particle number to be arbitrarily large.

Fig. 4.3 Eigenvalues of the operator $E = \exp(\mathrm i\Phi)$ with $\Phi = \Phi^\dagger$. These are evenly distributed over
the unit circle in the complex plane. Here $s = 44$ has been chosen. It could be much larger. The
only requirement is that it should be finite

We now relate phase and particle number states. The wave-particle duality allows
a sharp phase only for an uncertain particle number, and conversely a sharp particle
number only for an uncertain phase. Here we simultaneously require the expansion
bases $\{|n\rangle\}$ and $\{|\phi_m\rangle\}$, and use

\[ |\phi_m\rangle = \sum_{n=0}^{s} |n\rangle\langle n|\phi_m\rangle \quad\text{and}\quad |n\rangle = \sum_{m=0}^{s} |\phi_m\rangle\langle\phi_m|n\rangle , \]

with $\langle\phi_m|n\rangle = \langle n|\phi_m\rangle^*$. Here the same limit $s$ was deliberately chosen for both
expansions, since the last equations are then fully valid. Approximations were made
previously, in particular, with discrete phases instead of continuous ones and with a
finite number of particles. If $s$ is sufficiently large, these assumptions are probably
justified.

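As the eigenvalues show, $E$ is unitary ($EE^\dagger = E^\dagger E = 1$). With $\Psi^\dagger\Psi = N$ and the
known decomposition into amplitude and phase factor, we set

\[ \Psi = E\,\sqrt{N} \quad\Longleftrightarrow\quad \Psi^\dagger = \sqrt{N}\,E^\dagger . \]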


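Hence, $\Psi = \sum_{n=1}^{s} |n-1\rangle\,\sqrt{n}\,\langle n|$ implies that

\[ E = \sum_{n=1}^{s} |n-1\rangle\langle n| + |s\rangle\langle 0| . \]

The last term results from the unitarity of $E$ (where we have chosen the phase
factor equal to unity). Consequently, we have $\langle n|E = \langle n+1|$, for $0 \le n < s$, and
$\langle s|E = \langle 0|$. Hence the eigenvalue equation of $E$ delivers the recursion formula
$\langle n+1|\phi_m\rangle = \langle n|E|\phi_m\rangle = \langle n|\phi_m\rangle \exp(\mathrm i\phi_m)$. If we choose the phase of the vacuum
state $|0\rangle$ (arbitrarily) equal to zero, we find $\langle n|\phi_m\rangle = \exp(\mathrm i n\phi_m)/\sqrt{s+1}$ as
a solution of the recursion formula, where the normalization factor results from
$1 = \langle\phi_m|\phi_m\rangle = \sum_{n=0}^{s} \langle\phi_m|n\rangle\langle n|\phi_m\rangle$.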
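The solution of the recursion can be checked directly. The sketch below constructs the cyclic matrix $E$ in the particle number basis and verifies that the vector with components $\exp(\mathrm i n\phi_m)/\sqrt{s+1}$ is an eigenvector with eigenvalue $\exp(\mathrm i\phi_m)$. The values `s = 7` and `m = 3` and the function names are assumptions chosen for this illustration.

```python
import cmath
import math

def shift_operator(s):
    """Matrix of E = sum_{n=1..s} |n-1><n| + |s><0| in the basis |0>, ..., |s>."""
    dim = s + 1
    E = [[0] * dim for _ in range(dim)]
    for n in range(1, dim):
        E[n - 1][n] = 1
    E[s][0] = 1                      # cyclic closure that makes E unitary
    return E

def phase_state(m, s):
    """Components <n|phi_m> = exp(i n phi_m)/sqrt(s+1) with phi_m = 2 pi m/(s+1)."""
    phi = 2 * math.pi * m / (s + 1)
    return [cmath.exp(1j * n * phi) / math.sqrt(s + 1) for n in range(s + 1)]

def apply(op, vec):
    return [sum(op[i][j] * vec[j] for j in range(len(vec))) for i in range(len(vec))]

def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

s, m = 7, 3                          # hypothetical small example
E = shift_operator(s)
v = phase_state(m, s)
eigenvalue = cmath.exp(2j * math.pi * m / (s + 1))

residual = max(abs(x - eigenvalue * y) for x, y in zip(apply(E, v), v))
print(residual, abs(inner(v, v)), abs(inner(phase_state(1, s), phase_state(2, s))))
```

The residual of the eigenvalue equation is at the level of rounding errors, the state is normalized, and two different phase states are orthogonal.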


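Hence in the basis $\{|n\rangle\}$, the three matrices $N$, $\Psi$ (or $\tilde\Psi$), and $E$ read

\[
N \mathrel{\hat=} \begin{pmatrix} 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ 0 & 0 & 2 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} , \quad
\Psi \mathrel{\hat=} \begin{pmatrix} 0 & \sqrt{1} & 0 & \cdots \\ 0 & 0 & \sqrt{2} & \cdots \\ 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} , \quad
E \mathrel{\hat=} \begin{pmatrix} 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \\ 1 & 0 & 0 & \cdots \end{pmatrix} .
\]

The element 1 in the matrix $E$ stands at the end of the first column. Then $E = \exp(\mathrm i\Phi)$
is unitary and $\Phi$ cyclic.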

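From the expression for $E$ in the particle number representation, we have the
commutation relation

\[ [E, N] = E - |s\rangle\,(s+1)\,\langle 0| , \]

and also $[E^\dagger, N] = -[E, N]^\dagger = -E^\dagger + |0\rangle\,(s+1)\,\langle s|$, along with
$[\tilde\Psi, \tilde\Psi^\dagger] = 1 - |s\rangle\,(s+1)\,\langle s|$. We now decompose the unitary operator $E$ like $\exp(\mathrm i\phi)$, using Euler's
formula to obtain

\[ E = C + \mathrm i S , \quad\text{with}\quad C = \frac{E + E^\dagger}{2} = C^\dagger \quad\text{and}\quad S = \frac{E - E^\dagger}{2\mathrm i} = S^\dagger , \]

and find

\[ [C, N] = +\mathrm i S + \frac{s+1}{2}\,\big(|0\rangle\langle s| - |s\rangle\langle 0|\big) , \]

\[ [S, N] = -\mathrm i C + \mathrm i\,\frac{s+1}{2}\,\big(|0\rangle\langle s| + |s\rangle\langle 0|\big) , \]

for these Hermitian operators, along with $C^2 + S^2 = \tfrac14\big((E + E^\dagger)^2 - (E - E^\dagger)^2\big) =
\tfrac12(EE^\dagger + E^\dagger E) = 1$. Note that we have $[C, S] = 0$, because $[E, E^\dagger] = 0$.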
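The smallest possible case, $s = 1$, is of no physical interest (the text assumes $s$ very large), but it allows the commutation relation $[E, N] = E - |s\rangle(s+1)\langle 0|$ to be checked by hand. The choice $s = 1$ is an assumption made only for this worked example, with rows and columns ordered as $|0\rangle, |1\rangle$:

```latex
\begin{aligned}
E &= |0\rangle\langle 1| + |1\rangle\langle 0| \mathrel{\hat=} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
N \mathrel{\hat=} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \\
[E, N] &= EN - NE \mathrel{\hat=} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}
        = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \\
E - |1\rangle\,2\,\langle 0| &\mathrel{\hat=} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} - \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix}
        = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.
\end{aligned}
```

In this case $E$ is Hermitian, so that $C = E$ and $S = 0$, and $C^2 + S^2 = E^2 = 1$ as required.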

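According to Sect. 4.2.7, we may now derive an uncertainty relation between
particle number and phase. In particular, we make use of $\Delta A \cdot \Delta B \ge \tfrac12\,|\overline{[A, B]}|$ for
$A = A^\dagger$ and $B = B^\dagger$. Since the physical states do not overlap with the state $|s\rangle$, we obtain
initially

\[ \Delta C \cdot \Delta N \ge \tfrac12\,|\overline{S}| \quad\text{and}\quad \Delta S \cdot \Delta N \ge \tfrac12\,|\overline{C}| , \]

along with $\Delta C \cdot \Delta S \ge 0$.
If we now associate the phase operator $\Phi$ with the unitary operator $E$, according
to p. 293, viz.,

\[ E = \exp(\mathrm i\Phi) , \quad\text{with}\quad \Phi = \Phi^\dagger , \]

then, for small phase uncertainty $\Delta\Phi \ll \pi$,

\[ \overline{C} \approx \cos\overline{\Phi} , \qquad \Delta C \approx |\sin\overline{\Phi}|\,\Delta\Phi , \]
\[ \overline{S} \approx \sin\overline{\Phi} , \qquad \Delta S \approx |\cos\overline{\Phi}|\,\Delta\Phi . \]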


Hence the above-mentioned uncertainty relations deliver the inequality $\Delta N \cdot \Delta\Phi \ge \tfrac12$
already announced in Sect. 4.1.3. However, this is not generally valid. If, for example,
all phases between 0 and $2\pi$ are equally probable (the phase uncertainty thus
being as large as possible), then $\overline{C}$ and $\overline{S}$ are both zero. Then $\Delta N = 0$ may hold,
even though we have $\Delta\Phi = \pi/\sqrt{3}$. Note that the phase uncertainty depends on the
reference phase; this is connected inextricably with the periodicity. In particular, if
two neighbouring phase states $\phi_m$ and $\phi_{m+1}$ are occupied with equal probability, then
for $m < s$, the uncertainty for $s \to \infty$ is negligible, while for $m = s$, it is equal to $\pi$.
The reference phase is then chosen so that $\Delta\Phi$ becomes as small as possible, or so
that $\overline{\Phi} \approx \pi$.

Fig. 4.4 Uncertainty relation between particle number and phase (horizontal axis $\Delta\Phi$ from 0 to
$\pi/\sqrt{3}$, vertical axis $\Delta N$). If all phases are equally probable, then the phase uncertainty is
$\Delta\Phi = \pi/\sqrt{3}$ and never greater. The continuous curve shows $\Delta N \cdot \Delta\Phi = 1/2$, thus the
approximation for $\Delta\Phi \approx 0$. The dashed curve shows the approximation for $\Delta N \approx 0$. The two
approximations complement each other quite well; only for $\Delta N \approx 1/2$, or $\Delta\Phi \approx 1$, do they
differ somewhat from the true curve, and this can be seen only if the image is enlarged.
See Phys. Lett. A 218 (1996) 1

For $\Delta N \approx 0$, it is better to take three neighboring states in the particle
number representation with the amplitudes

\[ \langle n|\psi\rangle = \sqrt{1 - (\Delta N)^2} \quad\text{and}\quad \langle n \pm 1|\psi\rangle = \frac{1}{\sqrt{2}}\,\Delta N , \]

since, after a Fourier transform, we obtain a phase uncertainty that is as small as
possible:

\[ (\Delta\Phi)^2 \ge \tfrac13\pi^2 - 4\sqrt{2}\,\Delta N\,\sqrt{1 - (\Delta N)^2} + \tfrac12\,(\Delta N)^2 . \]


Here, when calculating $\overline{\Phi}$ and $\overline{\Phi^2}$, we replace the sums over $\phi_m$ by integrals and
$\langle\phi|\psi\rangle$ by $(2\pi)^{-1/2} \sum_n \exp(-\mathrm i n\phi)\,\langle n|\psi\rangle$. The exact limit is shown in Fig. 4.4. It is
rather well described by either approximation.
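The three-state estimate can be reproduced numerically. The sketch below assumes the amplitudes $\sqrt{1-(\Delta N)^2}$ for the central state and $\Delta N/\sqrt{2}$ for its two neighbours, a reference interval $-\pi < \phi < \pi$ centred on the mean phase, and a simple midpoint rule; it compares the integral of $\phi^2\,|\langle\phi|\psi\rangle|^2$ with the closed form $\tfrac13\pi^2 - 4\sqrt2\,\Delta N\sqrt{1-(\Delta N)^2} + \tfrac12(\Delta N)^2$. The function names and the step count are assumptions for this illustration.

```python
import math

def phase_variance(delta_n, steps=20000):
    """Midpoint-rule integral of phi^2 |<phi|psi>|^2 over (-pi, pi) for the three-state ansatz."""
    a0 = math.sqrt(1 - delta_n ** 2)     # amplitude of the central number state
    a1 = delta_n / math.sqrt(2)          # amplitude of each neighbouring state
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        phi = -math.pi + (k + 0.5) * h
        # |<phi|psi>|^2 = (a0 + 2 a1 cos(phi))^2 / (2 pi); the mean phase is zero by symmetry
        density = (a0 + 2 * a1 * math.cos(phi)) ** 2 / (2 * math.pi)
        total += phi * phi * density * h
    return total

def closed_form(delta_n):
    """Closed-form value of the phase variance for the three-state ansatz."""
    return (math.pi ** 2 / 3
            - 4 * math.sqrt(2) * delta_n * math.sqrt(1 - delta_n ** 2)
            + 0.5 * delta_n ** 2)

for delta_n in (0.0, 0.1, 0.3):
    print(delta_n, phase_variance(delta_n), closed_form(delta_n))
```

For $\Delta N = 0$ both expressions reduce to $\pi^2/3$, the value for equally probable phases.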

Hence now we understand that the particle and wave pictures—granularity and the 
capacity to interfere—are not in contradiction if we take into account the uncertainties 
in particle number and phase. 

In the rest of this chapter (Quantum Mechanics I), we will consider only one-particle
states (as representative of a statistical ensemble of bosons or fermions), but
now these particles will no longer be restricted to a single state.




4.2.10 Doublets and Pauli Operators 


The two-dimensional vector space is highly instructive and full of possibilities for 
applications. It is needed for the spin states of fermions with spin 1/2 (e.g., for 
electrons), for isospin (neutron and proton states as the two states of the nucleon), 
and also for the Pauli principle and model calculations of the excitation of atoms 
(Sect. 5.5.7). 

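If we call the two states $|\!\uparrow\rangle$ and $|\!\downarrow\rangle$ (up and down), then, according to Sect. 4.2.8,
the umklapp operator $\Psi$ can be introduced for this system with the property

\[ \Psi\Psi^\dagger + \Psi^\dagger\Psi = 1 . \]

We now write $|\!\downarrow\rangle$ instead of $|0\rangle$ and $|\!\uparrow\rangle$ instead of $|1\rangle$, because up and down are
easier to remember than the position of 0 and 1. If we consider these states as column
vectors, then with $\Psi^\dagger\,|\!\downarrow\rangle = |\!\uparrow\rangle$, $\Psi^\dagger\,|\!\uparrow\rangle = |o\rangle = \Psi\,|\!\downarrow\rangle$, and $\Psi\,|\!\uparrow\rangle = |\!\downarrow\rangle$, we have

\[
\Psi \mathrel{\hat=} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} , \quad
\Psi^\dagger \mathrel{\hat=} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} , \quad
\Psi\Psi^\dagger \mathrel{\hat=} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} , \quad
\Psi^\dagger\Psi \mathrel{\hat=} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} .
\]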

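All other $2 \times 2$ matrices can be expressed as linear combinations of these. However,
we prefer to have Hermitian matrices as a basis, including among them the unit
matrix:

\[
\begin{aligned}
C_0 &= \Psi\Psi^\dagger + \Psi^\dagger\Psi \mathrel{\hat=} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1 , \\
C_1 &= \Psi + \Psi^\dagger \mathrel{\hat=} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \sigma_x , \\
C_2 &= \mathrm i\,(\Psi - \Psi^\dagger) \mathrel{\hat=} \begin{pmatrix} 0 & -\mathrm i \\ \mathrm i & 0 \end{pmatrix} = \sigma_y , \\
C_3 &= \Psi^\dagger\Psi - \Psi\Psi^\dagger \mathrel{\hat=} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \sigma_z .
\end{aligned}
\]

The notation $C_n$ is taken from Sect. 4.2.5, but the notation with 1 and the Pauli
operator $\boldsymbol\sigma$ is more often used, where $\sigma_\pm = \tfrac12(\sigma_x \pm \mathrm i\sigma_y)$ is introduced. Clearly, we
also have

\[
\Psi = \frac{\sigma_x - \mathrm i\sigma_y}{2} = \sigma_- , \qquad
\Psi^\dagger = \frac{\sigma_x + \mathrm i\sigma_y}{2} = \sigma_+ , \qquad
\Psi\Psi^\dagger = \frac{1 - \sigma_z}{2} , \qquad
\Psi^\dagger\Psi = \frac{1 + \sigma_z}{2} .
\]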


The operators of the new basis $\{C_n\}$ are not only Hermitian, but also unitary, this
resulting from the necessary normalization in $\mathrm{tr}(C_n^\dagger C_{n'}) = \delta_{nn'}\,\mathrm{tr}\,1 = 2\,\delta_{nn'}$, since
their squares are equal to the unit operator:

\[ C_n = C_n^{-1} = C_n^\dagger , \qquad \mathrm{tr}\,C_n = 2\,\delta_{n0} . \]

Hence, their eigenvalues are real and of absolute value 1. They result from the last
equation: $C_0$ has the two-fold eigenvalue 1, while the other three operators each
have the eigenvalues $+1$ and $-1$. In addition, these three do not commute with each
other, but anti-commute. Matrix multiplication delivers $C_1 C_2 = \mathrm i\,C_3$, and because
$C_n^\dagger = C_n$, we thus have $C_2 C_1 = -\mathrm i\,C_3 = -C_1 C_2$. Cyclic permutation of the indices
1, 2, 3 is allowed, since $C_2 C_3 = C_2\,(\mathrm i\,C_2 C_1) = \mathrm i\,C_1 = \mathrm i\,C_1 C_2 C_2 = -C_3 C_2$, and so
on:

\[ C_1 C_2 = \mathrm i\,C_3 = -C_2 C_1 , \quad\text{or}\quad [C_1, C_2] = 2\mathrm i\,C_3 \quad\text{and cyclic permutations.} \]


Here the notation with the Pauli operator $\boldsymbol\sigma$ proves useful because the commutator
can then be written as a vector product:

\[ \boldsymbol\sigma \times \boldsymbol\sigma = 2\mathrm i\,\boldsymbol\sigma . \]

The vector product of the Pauli operator $\boldsymbol\sigma$ with itself does not vanish, in contrast to
what happens with classical vectors, because its components do not commute with
each other. Hence apart from 1, only one further component can be diagonalized; in
our example, this is $\sigma_z = C_3$. But it could also be $\sigma_x$ or $\sigma_y$. Here only a rotation
(unitary transformation) would be necessary, but note that $\sigma_z$ would then no longer
be diagonal.
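The algebra of the three Pauli matrices is easy to confirm by explicit matrix multiplication. The following sketch uses plain nested lists; the helper names `matmul`, `scale`, `add`, and `close` are assumptions introduced for this check.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def scale(c, m):
    return [[c * m[i][j] for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

unit = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]

# Squares equal the unit matrix.
squares_ok = all(close(matmul(s, s), unit) for s in (sx, sy, sz))
# Products: sigma_x sigma_y = i sigma_z and cyclic permutations.
cyclic_ok = (close(matmul(sx, sy), scale(1j, sz))
             and close(matmul(sy, sz), scale(1j, sx))
             and close(matmul(sz, sx), scale(1j, sy)))
# Different components anti-commute.
anti_ok = close(add(matmul(sx, sy), matmul(sy, sx)), zero)

print(squares_ok, cyclic_ok, anti_ok)
```

All three checks hold, which is the content of $[\sigma_x, \sigma_y] = 2\mathrm i\,\sigma_z$ and its cyclic permutations.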

The four operators are orthogonal to each other:

\[ \mathrm{tr}(C_n C_{n'}) = 2\,\delta_{nn'} . \]

Here we recognize why in Sect. 4.2.5 the normalization of the basis operators was
left open. With this orthonormalization, according to p. 297, all $2 \times 2$ matrices $A$
can be written in the form

\[ A = \frac{1\,\mathrm{tr}A + \boldsymbol\sigma \cdot \mathrm{tr}(\boldsymbol\sigma A)}{2} . \]

Their eigenvalues follow from $\det(A - a\,1) = 0$, hence $a^2 - a\,\mathrm{tr}A + \det A = 0$, which
implies

\[ a_\pm = \frac{\mathrm{tr}A \pm \sqrt{(\mathrm{tr}A)^2 - 4\det A}}{2} . \]

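If we expand the eigenvectors $|\pm\rangle$ in terms of the other basis $\{|\!\uparrow\rangle, |\!\downarrow\rangle\}$, then from
$A\,|\pm\rangle = |\pm\rangle\,a_\pm$, we obtain the homogeneous system of equations

\[
\begin{aligned}
\big(\langle\uparrow\!|A|\!\uparrow\rangle - a_\pm\big)\,\langle\uparrow\!|\pm\rangle + \langle\uparrow\!|A|\!\downarrow\rangle\,\langle\downarrow\!|\pm\rangle &= 0 , \\
\langle\downarrow\!|A|\!\uparrow\rangle\,\langle\uparrow\!|\pm\rangle + \big(\langle\downarrow\!|A|\!\downarrow\rangle - a_\pm\big)\,\langle\downarrow\!|\pm\rangle &= 0 .
\end{aligned}
\]

Fig. 4.5 Level repulsion. For fixed interaction $V = 2\,|\langle\downarrow\!|A|\!\uparrow\rangle|$, the splitting of the eigenvalues $a_\pm$
(i.e., $a_+ - a_-$) is shown here (red) as a function of the unperturbed level distance
$\delta = \langle\uparrow\!|A|\!\uparrow\rangle - \langle\downarrow\!|A|\!\downarrow\rangle$. Here the state $|+\rangle$ goes from $|\!\downarrow\rangle$ to $|\!\uparrow\rangle$, the state $|-\rangle$ from $-|\!\uparrow\rangle$ to $|\!\downarrow\rangle$.
Without anti-crossing, the dashed blue lines would be valid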

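This fixes the expansion coefficients only up to a common factor, but because of
the normalization condition $|\langle\uparrow\!|\pm\rangle|^2 + |\langle\downarrow\!|\pm\rangle|^2 = 1$, only a common phase factor
remains open. If (for $A = A^\dagger$), we set

\[
\begin{pmatrix} |+\rangle \\ |-\rangle \end{pmatrix} =
\begin{pmatrix} \cos\alpha & \mathrm e^{\mathrm i\beta}\sin\alpha \\ -\mathrm e^{-\mathrm i\beta}\sin\alpha & \cos\alpha \end{pmatrix}
\begin{pmatrix} |\!\uparrow\rangle \\ |\!\downarrow\rangle \end{pmatrix} ,
\]

with real parameters $0 \le \alpha \le \tfrac12\pi$ and $0 \le \beta < 2\pi$, according to p. 293, and using
the abbreviations (for a Hermitian operator $A$ they are real)

\[ \delta = \langle\uparrow\!|A|\!\uparrow\rangle - \langle\downarrow\!|A|\!\downarrow\rangle \quad\text{and}\quad \Delta = a_+ - a_- \ge |\delta| , \]

since $\Delta = \sqrt{\delta^2 + 4\,|\langle\downarrow\!|A|\!\uparrow\rangle|^2}$, we obtain the equation

\[ \exp(\mathrm i\beta)\,\tan\alpha = \frac{2\,\langle\downarrow\!|A|\!\uparrow\rangle}{\Delta + \delta} . \]

The phase of $\langle\downarrow\!|A|\!\uparrow\rangle$ is thus equal to $\beta$ and that of $\langle\uparrow\!|A|\!\downarrow\rangle$ is equal to $-\beta$, while
$\delta$ and $\Delta$ determine the parameter $\alpha$:

\[ \cos\alpha = \sqrt{\frac{\Delta + \delta}{2\Delta}} \quad\text{and}\quad \sin\alpha = \sqrt{\frac{\Delta - \delta}{2\Delta}} . \]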

If $A$ is the Hamilton operator $H$, then, with $\Delta \ge |\delta|$, we speak of level repulsion or
anti-crossing. Once the off-diagonal element $\langle\downarrow\!|A|\!\uparrow\rangle$ contributes, the separation
between the eigenvalues increases (see Fig. 4.5).
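Level repulsion can be seen in a few lines of code. The sketch below computes the two eigenvalues of a Hermitian $2\times2$ matrix from its trace and determinant and returns the mixing angle from $\cos\alpha = \sqrt{(\Delta+\delta)/(2\Delta)}$. The function names and the sample numbers are assumptions chosen for illustration.

```python
import math

def eigenvalues_2x2(a_up, a_down, coupling):
    """Eigenvalues a_+ >= a_- of the Hermitian matrix [[a_up, conj(c)], [c, a_down]]."""
    trace = a_up + a_down
    det = a_up * a_down - abs(coupling) ** 2
    root = math.sqrt(trace ** 2 - 4 * det)      # equals the splitting a_+ - a_-
    return (trace + root) / 2, (trace - root) / 2

def mixing_angle(a_up, a_down, coupling):
    """Angle alpha with cos(alpha) = sqrt((Delta + delta) / (2 Delta)); needs Delta > 0."""
    delta = a_up - a_down
    a_plus, a_minus = eigenvalues_2x2(a_up, a_down, coupling)
    splitting = a_plus - a_minus
    return math.acos(math.sqrt((splitting + delta) / (2 * splitting)))

# Degenerate unperturbed levels: the coupling alone produces the splitting.
print(eigenvalues_2x2(1.0, 1.0, 0.5), mixing_angle(1.0, 1.0, 0.5))
# Unperturbed distance 3 and coupling 2: the splitting grows from 3 to 5.
print(eigenvalues_2x2(3.0, 0.0, 2.0))
```

For degenerate unperturbed levels the two states mix with $\alpha = \pi/4$, and in both examples the splitting is at least as large as the unperturbed distance $|\delta|$.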




4.2.11 Density Operator. Pure States and Mixtures 


The properties of a given statistical ensemble can be determined by appropriate
measurements. They deliver the expectation values of the corresponding Hermitian
operators $A$, and hence we arrive at conclusions relating to the state of the ensemble.
So far we have dealt only with pure states $|\psi\rangle$. Then we have

\[ \langle A\rangle = \langle\psi|A|\psi\rangle = \sum_{nn'} \langle\psi|n\rangle\,\langle n|A|n'\rangle\,\langle n'|\psi\rangle , \]

if we assume a countable basis, otherwise there is a double integral instead of the
double sum. The expression on the right can be simplified to $\sum_n \langle n|\psi\rangle\langle\psi|A|n\rangle$. Hence
we may also write $\langle A\rangle = \mathrm{tr}(P_\psi A)$, where $P_\psi = |\psi\rangle\langle\psi|$ was introduced on p. 291 as
the projection operator acting on the (pure) state $|\psi\rangle$.

A finite number of measurable quantities suffices to determine the given statistical
ensemble uniquely. An ensemble of experimental values $\{\langle A_i\rangle\}$ will describe our
object. For example, in Sect. 4.1.4, we took the ensemble of electrons with momentum
$\langle\mathbf P\rangle$ and spin polarization $\langle\mathbf S\rangle$. But then the statistical ensemble does not need to form
a pure state $|\psi\rangle$. It may also be a mixture thereof, thus an incoherent superposition
of pure states $|n\rangle$ (or projectors $|n\rangle\langle n|$) with probabilities $p_n$. Hence, instead of $P_\psi$,
we now take the general density operator

\[ \rho = \sum_n |n\rangle\,p_n\,\langle n| \quad\Longleftrightarrow\quad \langle A\rangle = \sum_n p_n\,\langle n|A|n\rangle = \mathrm{tr}(\rho A) . \]

The ensemble of experimental values $\{\langle A_i\rangle\}$ fixes the density operator $\rho$, since the
matrix elements of $A_i$ follow from the relevance of this operator (position, momentum,
energy, etc.) and hence $\{\langle A_i\rangle\} = \{\mathrm{tr}(\rho A_i)\}$ is an inhomogeneous linear system of
equations for the matrix elements of $\rho$. Here the density operator describes the given
system and the Hermitian operators $A_i$ the observables.

The properties of $\rho$ compiled in the following are valid for pure as well as for
mixed states (as shown below, the two kinds of states can be distinguished by the
easily verifiable attributes of the density operator). We want to fix $\rho$ only by
$\{\langle A_i\rangle\} = \{\mathrm{tr}(\rho A_i)\}$ and make use of known properties of the observables.

The density operator is a matrix of finite dimension, determined by a finite number
of experimental values. Hence the operators commute in $\mathrm{tr}(\rho A)$.

All Hermitian operators $A$ have real expectation values. Hence also the density
operator is Hermitian: $\langle A\rangle - \langle A\rangle^* = \mathrm{tr}\{(\rho - \rho^\dagger)\,A\}$ must always vanish. In addition,
all observables with only positive eigenvalues (so-called positive-definite operators)
have positive expectation values, so the density matrix has to be positive-semidefinite:
none of the diagonal elements of $\rho$ can be negative in any representation.
Since the unit operator always has unit expectation value, the trace of $\rho$ must
be equal to 1. Thus we can list a total of three requirements, viz.,

\[ \rho = \rho^\dagger , \qquad \langle n|\rho|n\rangle \ge 0 , \qquad \mathrm{tr}\,\rho = 1 , \]

and this actually results in the fact that $\rho$ is positive-semidefinite, with $\rho = \rho^\dagger$. Here
the diagonal element $\langle n|\rho|n\rangle$ gives the probability (or probability density) for the
state $|n\rangle$, while the last equation corresponds to our normalization condition for the
probabilities. The off-diagonal elements lead to interference and are occasionally
referred to as the coherences of the system.

Under unitary transformations all operators change, including the density operator,
according to the prescription $A' = UAU^\dagger$. But here the expectation values remain
the same, because with the finite dimension of $\rho$, the trace of $\rho' A' = U\rho A U^\dagger$ remains
constant, according to p. 294.

For a pure state, we have $\rho = \rho^2$, but not for a mixture. In the eigen-representation
of $\rho$, $\rho^2$ is also diagonal, and for a pure state only one of these diagonal elements
is different from zero (namely 1), but for a mixture at least two are different from
zero, and these are then smaller for $\rho^2$ than for $\rho$. With $\mathrm{tr}\,\rho = 1$, it thus follows that

\[ \mathrm{tr}\,\rho^2 = 1 \quad\text{for all pure states,} \]

\[ \mathrm{tr}\,\rho^2 < 1 \quad\text{for all mixtures.} \]

With the trace of $\rho^2$, we thus have a very simple test of whether we are dealing
with a pure state or a mixture, since for the trace, we do not need to search for the
eigen-representation, because the diagonal elements suffice.

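In particular, for a two-level system with $\mathrm{tr}\,\rho = 1$ and because $\mathrm{tr}(\boldsymbol\sigma\rho) = \langle\boldsymbol\sigma\rangle$,
according to the last section, we have

\[ \rho = \frac{1 + \boldsymbol\sigma \cdot \langle\boldsymbol\sigma\rangle}{2} \quad\text{and}\quad \mathrm{tr}\,\rho^2 = \frac{1 + \langle\boldsymbol\sigma\rangle \cdot \langle\boldsymbol\sigma\rangle}{2} . \]

The quantity $\langle\boldsymbol\sigma\rangle$ is called the polarization. Since the eigenvalues of the components
of $\boldsymbol\sigma$ are equal to $\pm 1$, we have $|\langle\boldsymbol\sigma\rangle| \le 1$. If the equality sign holds here, then we have
a pure state, otherwise a mixture, e.g., for an unpolarized state $\langle\boldsymbol\sigma\rangle = 0$: unpolarized
electrons form a mixture, their two spin states being incoherently superposed.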
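The purity test can be tried out for a two-level system. The sketch below builds $\rho = \tfrac12(1 + \boldsymbol\sigma\cdot\mathbf p)$ from a polarization vector $\mathbf p$ and evaluates $\mathrm{tr}\,\rho^2$, which should equal $\tfrac12(1 + |\mathbf p|^2)$. The function names `density_matrix` and `purity` and the sample vectors are assumptions for this illustration.

```python
def density_matrix(p):
    """Return rho = (1 + sigma . p)/2 as a 2x2 matrix for a polarization vector p = (px, py, pz)."""
    px, py, pz = p
    return [[(1 + pz) / 2, (px - 1j * py) / 2],
            [(px + 1j * py) / 2, (1 - pz) / 2]]

def purity(rho):
    """Return tr(rho^2) = sum_ij rho_ij rho_ji."""
    return sum(rho[i][j] * rho[j][i] for i in range(2) for j in range(2)).real

pure = density_matrix((0.0, 0.0, 1.0))        # fully polarized along z
partial = density_matrix((0.6, 0.0, 0.0))     # partially polarized along x
unpolarized = density_matrix((0.0, 0.0, 0.0))

print(purity(pure), purity(partial), purity(unpolarized))
```

The three values are 1 for the pure state, 0.68 for the partially polarized mixture, and 0.5 for the unpolarized ensemble, which is the smallest value possible for two levels.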

For an $N$-state system, the density matrix has $N^2$ elements, which are determined
by equally many real numbers because $\rho^\dagger = \rho$. One of them is known already due
to the normalization. Thus $N^2 - 1$ experimental values suffice for this system. In
contrast, we could fix a pure state with just $2N - 2$ real numbers, or $N$ complex
numbers, but where two real numbers are omitted because of the normalization and
the arbitrary common phase. For the density operator, there is no arbitrariness in the
phase, since its bearing on the bra- and ket-vector cancels.

The smaller $\mathrm{tr}\,\rho^2$, the less pure the $N$-state system appears, and $\mathrm{tr}\,\rho^2$ is smallest
when all eigenvalues are equal, in which case it is a complete mixture, and then $\rho$ is a
multiple of the unit operator, in particular, with the eigenvalues $N^{-1}$, since $\mathrm{tr}\,\rho = 1$.
Hence we have upper and lower bounds for $\mathrm{tr}\,\rho^2$:

\[ \frac{1}{N} \le \mathrm{tr}\,\rho^2 \le 1 . \]

These only depend on the dimension $N$ of the Hilbert space.
Let us consider these properties for the operator basis $\{C_n\}$ from Sect. 4.2.5. For
Hermitian basis operators, the expansion coefficients are real, and we have

\[ \rho = \frac{1}{c} \sum_{n=0}^{N^2-1} C_n\,\langle C_n\rangle \quad\text{and}\quad \mathrm{tr}\,\rho^2 = \frac{1}{c} \sum_{n=0}^{N^2-1} \langle C_n\rangle^2 . \]

If $C_0$ is a multiple of the unit operator, then the old requirement $\mathrm{tr}\,C_0^2 = c$ in the
$N$-dimensional Hilbert space leads us to $C_0 = \sqrt{c/N}\;1$, and hence with $\mathrm{tr}\,\rho = 1$ to
$\langle C_0\rangle = \mathrm{tr}\,\rho\,C_0 = \sqrt{c/N}$. Then only the remaining $N^2 - 1$ expectation values $\langle C_n\rangle$
are important, and these can be taken as components of a vector, usually called the
Bloch vector (more on that in Sect. 4.4.3). The square of its length is

\[ \sum_{n=1}^{N^2-1} \langle C_n\rangle^2 = c\,\Big(\mathrm{tr}\,\rho^2 - \frac{1}{N}\Big) , \]

thus zero for complete mixtures and greatest for pure states, when it is equal to
$c\,(1 - N^{-1})$.
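As a consistency check, the general result can be specialized to the two-level system of Sect. 4.2.10. Taking $N = 2$ with the Pauli operators as basis, so that $c = 2$ (an identification made here for illustration), the length of the Bloch vector reduces to the polarization:

```latex
\begin{aligned}
\sum_{n=1}^{3} \langle\sigma_n\rangle^2
  &= |\langle\boldsymbol\sigma\rangle|^2
   = 2\,\Big(\mathrm{tr}\,\rho^2 - \tfrac12\Big)
   = 2\,\mathrm{tr}\,\rho^2 - 1 , \\
\mathrm{tr}\,\rho^2 &= \frac{1 + |\langle\boldsymbol\sigma\rangle|^2}{2} , \qquad
0 \le |\langle\boldsymbol\sigma\rangle|^2 \le c\,\big(1 - N^{-1}\big) = 1 .
\end{aligned}
```

This agrees with the expression for $\mathrm{tr}\,\rho^2$ in terms of the polarization given for the two-level system.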


4.2.12 Space Inversion and Time Reversal 


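With a space inversion $\mathcal P$, the space directions are reversed, and with a time reversal
$\mathcal T$, only motions are reversed:

\[ \mathcal P\,\mathbf R\,\mathcal P^{-1} = -\mathbf R , \qquad \mathcal T\,\mathbf R\,\mathcal T^{-1} = +\mathbf R , \]
\[ \mathcal P\,\mathbf P\,\mathcal P^{-1} = -\mathbf P , \qquad \mathcal T\,\mathbf P\,\mathcal T^{-1} = -\mathbf P . \]

The space inversion is a unitary transformation, but not the time reversal, since
unitary transformations do not change algebraic relations between operators. However,
$\mathcal T\,[X, P]\,\mathcal T^{-1} = [\mathcal T X \mathcal T^{-1}, \mathcal T P \mathcal T^{-1}]$ is equal to $-[X, P]$. This can
only be inserted into the previous context without contradiction if $\mathcal T$ is an anti-linear
operator, thus changing all numbers into their complex conjugates, and hence
$\mathcal T\,(\mathrm i\hbar\,1)\,\mathcal T^{-1}$ into $-\mathrm i\hbar\,1$.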

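For anti-linear operators, according to p. 289, we have $\mathcal T\,(|\psi\rangle\,a) = (\mathcal T\,|\psi\rangle)\,a^*$. If
we set

\[ |\overline{\psi}\rangle = \mathcal T\,|\psi\rangle , \]

then $|\overline{\psi}\rangle$ can be expanded with $|\psi\rangle = \sum_n |n\rangle\,\psi_n$, $|\overline{\psi}\rangle = \sum_n |\overline{n}\rangle\,\psi_n^*$, and correspondingly
$\langle\overline{\varphi}| = \sum_n \varphi_n\,\langle\overline{n}|$. We obtain generally

\[ \langle\overline{\varphi}|\overline{\psi}\rangle = \langle\varphi|\psi\rangle^* . \]

Note that $|\overline{\psi}\rangle = \mathcal T\,|\psi\rangle$ depends anti-linearly on $|\psi\rangle$, as does the scalar product $\langle\overline{\varphi}|\overline{\psi}\rangle$.
Consequently, its complex conjugate value depends linearly on $|\psi\rangle$. Correspondingly,
from $|\chi\rangle = A\,|\psi\rangle$, we infer $|\overline{\chi}\rangle = \mathcal T A \mathcal T^{-1}\,|\overline{\psi}\rangle$, and then also

\[ \langle\overline{\varphi}|\,\mathcal T A \mathcal T^{-1}\,|\overline{\psi}\rangle = \langle\varphi|A|\psi\rangle^* . \]

For $A = A^\dagger$, this is equal to $\langle\psi|A|\varphi\rangle$. In particular, we have $\langle\overline{\varphi}|\mathbf R|\overline{\psi}\rangle = \langle\psi|\mathbf R|\varphi\rangle$
and $\langle\overline{\varphi}|\mathbf P|\overline{\psi}\rangle = -\langle\psi|\mathbf P|\varphi\rangle$.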

If $|\psi\rangle$ is an eigenstate of $\mathcal T$, then its phase influences the eigenvalue. In particular,
if $\mathcal T\,|\psi\rangle = |\psi\rangle$ holds, then so does $\mathcal T\,(|\psi\rangle\,\mathrm e^{\mathrm i\phi}) = |\psi\rangle\,\mathrm e^{-\mathrm i\phi} = (|\psi\rangle\,\mathrm e^{\mathrm i\phi})\,\mathrm e^{-2\mathrm i\phi}$. Thus,
the two eigenvalues differ by the factor $\mathrm e^{-2\mathrm i\phi}$. Hence we can fix each state via the
time reversal behavior of the phase, but we cannot assign a quantum number to the
time reversal.

For particles without spin, after the choice of a basis with unique phases, the
complex conjugation operator $\mathcal K$ can be used as the time reversal operator $\mathcal T$. Then
we have $\mathcal T^2 = 1$, independently of the choice of phases.

For particles with half-integer spin, we also have to consider $\mathcal{T}\boldsymbol{\sigma}\mathcal{T}^{-1} = -\boldsymbol{\sigma}$. For motion reversal, in particular, the spin becomes inverted along with the angular momentum, since the spin is to be understood as an eigen angular momentum $\mathbf{S}$, as we shall see on p. 329. Now, according to Sect. 4.2.10, $\mathcal{K}\,(\sigma_x, \sigma_y, \sigma_z)\,\mathcal{K}^{-1} = (\sigma_x, -\sigma_y, \sigma_z)$ holds. Hence only $\mathcal{T} = i\sigma_y\,\mathcal{K}$ leads to the required behavior, where the phase factor $i$ is arbitrary, but then the factor in front of $\mathcal{K}$ corresponds to a rotation through the angle $\pi$ about the $y$-axis. Independently of this choice of phase, we now have $\mathcal{T}^2 = -1$ (for spin-1/2 particles), a truly astonishing result, since classically the two-fold reversal of the motion leads back to the original state. But note that a 360° rotation of a spin-1/2 particle leads to the state with the opposite sign.

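For $\mathcal{T}^2 = \pm 1$, we have the equations

\[ \langle\overline{\psi}|\psi\rangle = \langle\overline{\overline{\psi}}|\overline{\psi}\rangle^* = \langle\overline{\psi}|\overline{\overline{\psi}}\rangle = \pm\langle\overline{\psi}|\psi\rangle . \]

From this it follows for $\mathcal{T}^2 = -1$ (half-integer spin) that $\langle\overline{\psi}|\psi\rangle = 0$. For half-integer spin, the states $|\psi\rangle$ and $|\overline{\psi}\rangle$ are orthogonal to each other and hence different. Since the Hamilton operator is generally invariant under time reversal, i.e., $H = \mathcal{T}H\mathcal{T}^{-1}$, fermions always have pairs of states $(|\psi\rangle, |\overline{\psi}\rangle)$ with equal energy. This is known as Kramers theorem. For bound states, $|\psi\rangle$ and $|\overline{\psi}\rangle$ differ by the spin orientation.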
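The spin-1/2 statements above can be checked numerically. The following is a minimal sketch, assuming the standard matrix representation of the Pauli operators and the choice $\mathcal{T} = i\sigma_y\mathcal{K}$; the helper names `time_reverse` and `conjugate_operator` are illustrative and not taken from the text.

```python
import numpy as np

# Pauli matrices in the usual sigma_z eigenbasis
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

U = 1j * sigma_y  # unitary factor in front of the complex conjugation K


def time_reverse(psi):
    """Apply T = i sigma_y K to a 2-spinor: conjugate first, then multiply by U."""
    return U @ np.conj(psi)


def conjugate_operator(A):
    """Matrix of T A T^{-1} = U A^* U^dagger for a linear operator A."""
    return U @ np.conj(A) @ U.conj().T


rng = np.random.default_rng(0)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)
psi_bar = time_reverse(psi)

# T sigma T^{-1} = -sigma for all three components
spin_flipped = all(
    np.allclose(conjugate_operator(s), -s) for s in (sigma_x, sigma_y, sigma_z)
)
# Applying T twice returns -psi, i.e. T^2 = -1
squares_to_minus_one = np.allclose(time_reverse(psi_bar), -psi)
# Overlap <psi_bar|psi>, which should vanish for half-integer spin
overlap = np.vdot(psi_bar, psi)
```

For any normalized spinor, `overlap` vanishes because the matrix $i\sigma_y$ is real and antisymmetric, which is the two-dimensional version of the orthogonality used in Kramers theorem.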

The eigenvalue of the space inversion operator $\mathcal{P}$ is the parity. Because $\mathcal{P}^2 = 1$, it takes the values $\pm 1$.




4.2.13 Summary: Operators and Observables 


In every physical theory, there are observables (measurable quantities). In quantum theory they are described by Hermitian operators, the eigenvalues of which correspond to the possible experimental values. If the system is in an eigenstate of the operator, then the associated eigenvalue always results as the experimental value, and the observable is sharp (certain). Otherwise, the eigenvalue $a$ (one of the possible experimental values) results with a statistical weight (or probability) given by $\langle a|\,\rho\,|a\rangle$, so that, on the average, the expectation value is

\[ \langle A\rangle = \sum_a \langle a|\,\rho\,|a\rangle\; a . \]

For a pure state $|\psi\rangle$, it is $\langle a|\,\rho\,|a\rangle = |\langle a|\psi\rangle|^2$, so $\rho = |\psi\rangle\langle\psi|$. For the uncertainty (the average error), we have $\Delta A = \sqrt{\langle A^2\rangle - \langle A\rangle^2}$ with $\langle A^2\rangle = \sum_a \langle a|\,\rho\,|a\rangle\; a^2$. Hence $\Delta A = 0$ for $\rho = |a\rangle\langle a|$.
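As an illustration of these formulas, the sketch below computes the weights $\langle a|\rho|a\rangle$, the expectation value, and the uncertainty for a finite-dimensional observable. The function name `measurement_statistics` and the choice of $\sigma_x$ as the observable are assumptions made only for this example.

```python
import numpy as np


def measurement_statistics(A, rho):
    """Return eigenvalues a, weights <a|rho|a>, expectation value and uncertainty."""
    a, vecs = np.linalg.eigh(A)  # columns of vecs are the eigenvectors |a>
    weights = np.real(np.einsum("ia,ij,ja->a", vecs.conj(), rho, vecs))
    mean = np.sum(weights * a)
    variance = np.sum(weights * a**2) - mean**2
    # Guard against tiny negative rounding errors before taking the square root
    return a, weights, mean, np.sqrt(max(variance, 0.0))


sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
up = np.array([1, 0], dtype=complex)                     # not an eigenstate of sigma_x
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)      # eigenstate with eigenvalue +1

rho_up = np.outer(up, up.conj())        # pure state rho = |psi><psi|
rho_plus = np.outer(plus, plus.conj())

_, w_up, mean_up, delta_up = measurement_statistics(sigma_x, rho_up)
_, w_plus, mean_plus, delta_plus = measurement_statistics(sigma_x, rho_plus)
```

In the first case both eigenvalues occur with equal weight and the observable is uncertain. In the second case the state is an eigenstate, so the uncertainty vanishes up to rounding.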

Non-commuting operators have no common set of eigenstates. Hence, not all the corresponding observables can be sharp at the same time. In particular, the uncertainty relation $\Delta X\cdot\Delta P \ge \tfrac{1}{2}\hbar$ follows from the commutation law

\[ [X, P] = i\hbar\,1 , \]

with which we shall deal later. Here $X$ and $P$ have continuous eigenvalue spectra, which differ from those of the operators so far considered, and which require improper Hilbert vectors.


4.3 Correspondence Principle 


4.3.1 Commutation Relations 


According to p. 300, we can ensure Heisenberg's uncertainty relation

\[ \Delta X^k\cdot\Delta P_{k'} \ge \tfrac{1}{2}\hbar\,\delta^k_{k'} , \]

by assigning Hermitian operators $X^k$ and $P_{k'}$ to the complementary variables position and momentum which obey the commutation relations (of Born and Jordan)

\[ [X^k, P_{k'}] = i\hbar\,\delta^k_{k'}\,1 . \]

Since the commutator is proportional to the unit operator, the product $\Delta X^k\cdot\Delta P_k$ of the uncertainties cannot be smaller than $\hbar/2$ for any state $|\psi\rangle$.

Here once again we shall always deal with pairs of canonically conjugate quantities and hence rely on Hamiltonian mechanics. The commutators correspond to the Poisson brackets, as we shall now show, since we shall use this key idea repeatedly to translate between classical and quantum dynamics.

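According to p. 124, all pairs of dynamical quantities $u$, $v$ have a Poisson bracket defined by

\[ [u, v] \equiv \sum_k \Big( \frac{\partial u}{\partial x^k}\,\frac{\partial v}{\partial p_k} - \frac{\partial u}{\partial p_k}\,\frac{\partial v}{\partial x^k} \Big) , \]

which does not depend on the choice of canonical coordinates $x^k$ and momenta $p_k = \partial L/\partial \dot{x}^k$ (otherwise it would not be canonical). In particular, classically, we have

\[ [x^k, x^{k'}] = 0 = [p_k, p_{k'}] , \qquad [x^k, p_{k'}] = \delta^k_{k'} . \]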


If we now require the classical Poisson bracket $[u, v]$ to become the expression $[U, V]/i\hbar$ in quantum theory,

\[ [u, v] \;\Longrightarrow\; \frac{[U, V]}{i\hbar} , \]

where $U$ and $V$ are the Hermitian operators in quantum theory corresponding to the classical $u$ and $v$, then we do indeed have

\[ [X^k, X^{k'}] = 0 = [P_k, P_{k'}] , \qquad [X^k, P_{k'}] = i\hbar\,\delta^k_{k'}\,1 . \]

If we replace the classical observables by Hermitian operators and the Poisson brackets by commutators divided by $i\hbar$, then the uncertainty relations are satisfied.

Since position and momentum operators do not commute with each other, in 
quantum physics no state can be given which contains position and momentum 
simultaneously as characterizing items. We have to choose: either the position alone 
or the momentum alone. But for each additional Cartesian component a new choice 
can be made. With each new degree of freedom, the state is amended by a new 
quantum number. 

With $[X, P] = i\hbar\,1$, according to p. 289, we have $[X, P^n] = n\,i\hbar\,P^{n-1}$. This is also true for negative integers: $[X, P^{-n}] = P^{-n}\,(P^nX - XP^n)\,P^{-n} = -n\,i\hbar\,P^{-n-1}$. Since the operators $X$ and $P$ have continuous eigenvalue spectra, in their eigenrepresentation, the derivative with respect to $X$, or indeed $P$, makes sense: it is simply the derivative with respect to the eigenvalue $x$, or the eigenvalue $p$. Hence, we write generally,

\[ [X, f(P)] = i\hbar\,\frac{df(P)}{dP} , \qquad [f(X), P] = i\hbar\,\frac{df(X)}{dX} . \]

It follows in particular (see the Hausdorff series on p. 290) that

\[ \exp(i\mathbf{a}\cdot\mathbf{P})\;\mathbf{R}\;\exp(-i\mathbf{a}\cdot\mathbf{P}) = \mathbf{R} + \hbar\,\mathbf{a} . \]

According to this, the unitary operator $\exp(i\mathbf{a}\cdot\mathbf{P})$ shifts all positions by $\hbar\mathbf{a}$, so it is a displacement operator. Furthermore, in classical mechanics, the (canonical) momentum is the generating function for infinitesimal displacements (see p. 130). Correspondingly, we have $\exp(i\mathbf{a}\cdot\mathbf{R})\;\mathbf{P}\;\exp(-i\mathbf{a}\cdot\mathbf{R}) = \mathbf{P} - \hbar\,\mathbf{a}$.


4.3.2 Position and Momentum Representations 


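In the real-space representation, the position operator $X$ is diagonal. We restrict ourselves initially to one dimension:

\[ \langle x|\,X\,|x'\rangle = x\,\delta(x - x') . \]

In this representation the momentum operator $P$ follows from the commutation relation $[X, P] = i\hbar\,1$, since from $i\hbar\,\delta(x - x') = \langle x|\,XP - PX\,|x'\rangle = (x - x')\,\langle x|\,P\,|x'\rangle$ with $\delta(x) = -x\,\delta'(x)$ (p. 21), we obtain

\[ \langle x|\,P\,|x'\rangle = -i\hbar\,\frac{\partial}{\partial x}\,\delta(x - x') = i\hbar\,\frac{\partial}{\partial x'}\,\delta(x - x') . \]

Hence we have $\langle x|\,P\,|\psi\rangle = \int dx'\,\langle x|\,P\,|x'\rangle\,\psi(x') = -i\hbar\,\partial\psi(x)/\partial x$. This can also be used for higher powers of $P$ in the real-space representation, since for

\[ \langle x|\,P^n\,|\psi\rangle = \int dx'\,\langle x|\,P\,|x'\rangle\,\langle x'|\,P^{n-1}\,|\psi\rangle , \]

the integral can be simplified with the delta function to $-i\hbar\,\partial\langle x|\,P^{n-1}\,|\psi\rangle/\partial x = (-i\hbar)^n\,\partial^n\psi/\partial x^n$.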

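In the real-space representation, we may thus replace $P\,|\psi\rangle$ by $-i\hbar\,\partial\psi/\partial x$. This is usually abbreviated as

\[ P \mathrel{\hat{=}} \frac{\hbar}{i}\,\frac{\partial}{\partial x} , \]

which is of course true only in the real-space representation, if $P$ acts on $\langle x|\psi\rangle = \psi(x)$. Correspondingly, in the displacement operator $U = \exp(iaP)$, all powers of the derivatives with respect to $x$ occur:

\[ \exp(iaP) \mathrel{\hat{=}} \sum_{n=0}^{\infty} \frac{(\hbar a)^n}{n!}\,\frac{\partial^n}{\partial x^n} , \]

as we also expect for the Taylor series. Note that, with

\[ UXU^{-1} = X + \hbar a \quad\text{and}\quad (X + \hbar a)\,|x - \hbar a\rangle = |x - \hbar a\rangle\,x , \]

$UX\,|x\rangle = U\,|x\rangle\,x$ leads to $U\,|x\rangle = |x - \hbar a\rangle$, or to $|x + \hbar a\rangle = U^\dagger\,|x\rangle$, and this in turn to

\[ \psi(x + \hbar a) = \langle x + \hbar a|\psi\rangle = \langle x|\,U\,|\psi\rangle . \]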
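The displacement property $\langle x|\,U\,|\psi\rangle = \psi(x + \hbar a)$ can be illustrated numerically. The sketch below is only an approximation on a periodic grid: it assumes a Gaussian test function and uses the discrete Fourier transform as a stand-in for the momentum representation, where $\exp(iaP)$ acts as multiplication by $\exp(iap)$. Grid size, box length, and the value of `a` are arbitrary choices.

```python
import numpy as np

hbar = 1.0
a = 1.5                                        # positions are shifted by hbar * a
N, box = 256, 40.0
x = (np.arange(N) - N // 2) * (box / N)        # periodic grid covering [-20, 20)
k = 2 * np.pi * np.fft.fftfreq(N, d=box / N)   # wave numbers, with p = hbar * k


def psi(x):
    """Normalized Gaussian test function (an assumed example state)."""
    return np.pi**-0.25 * np.exp(-x**2 / 2)


# exp(iaP) is diagonal in the momentum representation: multiply by exp(i a p)
shifted = np.fft.ifft(np.exp(1j * a * hbar * k) * np.fft.fft(psi(x)))

# Compare with the analytically shifted function psi(x + hbar * a)
max_error = np.max(np.abs(shifted - psi(x + hbar * a)))
```

The agreement is limited only by rounding here, because the Gaussian is negligible at the box boundary and well resolved by the grid.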
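In the momentum representation and since $[X, P] = -[P, X]$, we also have

\[ \langle p|\,P\,|p'\rangle = p\,\delta(p - p') , \qquad \langle p|\,X\,|p'\rangle = i\hbar\,\frac{\partial}{\partial p}\,\delta(p - p') = \frac{\hbar}{i}\,\frac{\partial}{\partial p'}\,\delta(p - p') , \]

thus $\langle p|\,P\,|\psi\rangle = p\,\psi(p)$ and $\langle p|\,X\,|\psi\rangle = i\hbar\,\partial\psi/\partial p$.

The results are easily extended to three dimensions. With $d\psi = \nabla_{\mathbf{r}}\psi\cdot d\mathbf{r} = \nabla_{\mathbf{p}}\psi\cdot d\mathbf{p}$, we find in particular,

\[ \begin{aligned}
\langle\mathbf{r}|\,\mathbf{R}\,|\mathbf{r}'\rangle &= \mathbf{r}\,\delta(\mathbf{r} - \mathbf{r}') &&\Longrightarrow& \langle\mathbf{r}|\,\mathbf{R}\,|\psi\rangle &= \mathbf{r}\,\psi(\mathbf{r}) ,\\
\langle\mathbf{r}|\,\mathbf{P}\,|\mathbf{r}'\rangle &= \frac{\hbar}{i}\,\nabla_{\mathbf{r}}\,\delta(\mathbf{r} - \mathbf{r}') &&\Longrightarrow& \langle\mathbf{r}|\,\mathbf{P}\,|\psi\rangle &= \frac{\hbar}{i}\,\nabla_{\mathbf{r}}\,\psi(\mathbf{r}) ,\\
\langle\mathbf{p}|\,\mathbf{R}\,|\mathbf{p}'\rangle &= i\hbar\,\nabla_{\mathbf{p}}\,\delta(\mathbf{p} - \mathbf{p}') &&\Longrightarrow& \langle\mathbf{p}|\,\mathbf{R}\,|\psi\rangle &= i\hbar\,\nabla_{\mathbf{p}}\,\psi(\mathbf{p}) ,\\
\langle\mathbf{p}|\,\mathbf{P}\,|\mathbf{p}'\rangle &= \mathbf{p}\,\delta(\mathbf{p} - \mathbf{p}') &&\Longrightarrow& \langle\mathbf{p}|\,\mathbf{P}\,|\psi\rangle &= \mathbf{p}\,\psi(\mathbf{p}) .
\end{aligned} \]

This can also be used for the matrix elements of such an operator $A$ between the states $\langle\varphi|$ and $|\psi\rangle$, if $\varphi(\mathbf{r}) = \langle\mathbf{r}|\varphi\rangle$ or $\varphi(\mathbf{p}) = \langle\mathbf{p}|\varphi\rangle$ are known, since

\[ \langle\varphi|\,A\,|\psi\rangle = \int d^3r\;\varphi^*(\mathbf{r})\,\langle\mathbf{r}|\,A\,|\psi\rangle = \int d^3p\;\varphi^*(\mathbf{p})\,\langle\mathbf{p}|\,A\,|\psi\rangle . \]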


4.3.3 The Probability Amplitude $\langle\mathbf{r}|\mathbf{p}\rangle$


We can now determine the Dirac bracket $\langle\mathbf{r}|\mathbf{p}\rangle$, i.e., the density of the probability amplitude of the state $|\mathbf{p}\rangle$ at the position $\mathbf{r}$, and then change from the position to the momentum representation. The reverse transformation is possible with $\langle\mathbf{p}|\mathbf{r}\rangle = \langle\mathbf{r}|\mathbf{p}\rangle^*$. We have in particular $p\,\langle x|p\rangle = \langle x|\,P\,|p\rangle = -i\hbar\,\partial/\partial x\,\langle x|p\rangle$ and hence $\langle x|p\rangle \propto \exp(ipx/\hbar)$ as a function of $x$. On the other hand, we also have $x\,\langle x|p\rangle = \langle p|\,X\,|x\rangle^* = -i\hbar\,\partial/\partial p\,\langle x|p\rangle$, and hence $\langle x|p\rangle \propto \exp(ixp/\hbar)$ as a function of $p$. The unknown proportionality factor thus depends neither on $x$ nor on $p$. We call it temporarily $c$ and determine it from the normalization condition $\delta(p - p') = \int dx\,\langle p|x\rangle\,\langle x|p'\rangle = |c|^2 \int dx\,\exp\{i\,(p' - p)\,x/\hbar\} = |c|^2\,2\pi\hbar\,\delta(p - p')$. Hence, it follows that $2\pi\hbar\,|c|^2 = 1$, where $2\pi\hbar = h$ (so we could just write $h$ here, but $\hbar$ occurs much more often than $h$, and we shall use it here too). We choose the arbitrary phase factor in the simplest possible way, viz., equal to unity. Then,

\[ \langle x|p\rangle = \frac{\exp(ipx/\hbar)}{\sqrt{2\pi\hbar}} \quad\Longrightarrow\quad \langle\mathbf{r}|\mathbf{p}\rangle = \frac{\exp(i\mathbf{p}\cdot\mathbf{r}/\hbar)}{\sqrt{2\pi\hbar}^{\,3}} . \]

For time reversal (motion reversal), according to p. 314, we then also have $\langle\overline{x}|\overline{p}\rangle = \langle x|p\rangle^* = (2\pi\hbar)^{-1/2}\,\exp(-ipx/\hbar) = \langle x|{-p}\rangle$.

The probability of the state $|\mathbf{p}\rangle$ in the space element $d^3r$ about $\mathbf{r}$ is now given by

\[ |\langle\mathbf{r}|\mathbf{p}\rangle|^2\,d^3r = \frac{d^3r}{(2\pi\hbar)^3} . \]

It does not depend on the position, so is equally large everywhere. (For a state with sharp momentum, whence $\Delta P = 0$, $\Delta X$ must be infinite!) Note that the integral over the infinite space does not result in 1, as we should require. The improper Hilbert space vector $|\mathbf{p}\rangle$ is not normalizable, so we need an error $\Delta P > 0$.

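For the superposition of several states, interference shows up. If, for instance, the state $|\psi\rangle$ contains the momenta $\mathbf{p}_1$ and $\mathbf{p}_2$ with probability amplitudes $\langle\mathbf{p}_1|\psi\rangle$ and $\langle\mathbf{p}_2|\psi\rangle$, respectively, then the associated probability density is

\[ |\langle\mathbf{r}|\psi\rangle|^2 = \frac{|\exp(i\mathbf{p}_1\cdot\mathbf{r}/\hbar)\,\langle\mathbf{p}_1|\psi\rangle + \exp(i\mathbf{p}_2\cdot\mathbf{r}/\hbar)\,\langle\mathbf{p}_2|\psi\rangle|^2}{(2\pi\hbar)^3} = \frac{|\langle\mathbf{p}_1|\psi\rangle|^2}{(2\pi\hbar)^3}\;\Big|\,1 + \frac{\langle\mathbf{p}_2|\psi\rangle}{\langle\mathbf{p}_1|\psi\rangle}\,\exp\frac{i\,(\mathbf{p}_2 - \mathbf{p}_1)\cdot\mathbf{r}}{\hbar}\,\Big|^2 . \]

It now depends on position, in particular in the direction of $\mathbf{p}_2 - \mathbf{p}_1$, and periodically, with the wave vector

\[ \mathbf{k} = \frac{\mathbf{p}_2 - \mathbf{p}_1}{\hbar} . \]

This we interpret as the interference of probability waves with wave vectors $\mathbf{k}_1$ and $\mathbf{k}_2$. Hence we arrive at the de Broglie relation

\[ \mathbf{p} = \hbar\,\mathbf{k} . \]

It follows therefore from our assumptions.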
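A one-dimensional numerical sketch of this interference pattern is given below. The momenta `p1`, `p2` and the amplitudes `c1`, `c2` are illustrative values chosen for the example; the density is built from the plane-wave amplitudes $\langle x|p\rangle = \exp(ipx/\hbar)/\sqrt{2\pi\hbar}$.

```python
import numpy as np

hbar = 1.0
p1, p2 = 0.7, 2.2          # two sharp momenta (illustrative values)
c1, c2 = 0.6, 0.8j         # amplitudes <p1|psi> and <p2|psi> (illustrative values)


def density(x):
    """Probability density |<x|psi>|^2 of the two-momentum superposition."""
    amplitude = (np.exp(1j * p1 * x / hbar) * c1
                 + np.exp(1j * p2 * x / hbar) * c2) / np.sqrt(2 * np.pi * hbar)
    return np.abs(amplitude) ** 2


k = (p2 - p1) / hbar               # wave number of the interference pattern
period = 2 * np.pi / abs(k)        # spatial period of the density

x = np.linspace(-10.0, 10.0, 2001)
is_periodic = np.allclose(density(x), density(x + period))
# Difference between maxima and minima, expected to be 4|c1||c2| / (2 pi hbar)
contrast = density(x).max() - density(x).min()
```

The density repeats after the distance $2\pi\hbar/|p_2 - p_1|$, and the contrast is set by the product of the two amplitudes.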

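It is clearly more convenient for the exponential function to use the wave vector $\mathbf{k}$ instead of the momentum $\mathbf{p}$, since then the denominator $\hbar$ drops out, and they are related to each other simply via the de Broglie relation. Hence, $|\mathbf{p}\rangle$ is often replaced by $|\mathbf{k}\rangle$. Both states belong to the same ray in (improper) Hilbert space, but are differently normalized. With $\langle\mathbf{p}|\mathbf{p}'\rangle = \delta(\mathbf{p} - \mathbf{p}') = \delta\{\hbar\,(\mathbf{k} - \mathbf{k}')\} = \hbar^{-3}\,\langle\mathbf{k}|\mathbf{k}'\rangle$, we have

\[ |\mathbf{k}\rangle = \sqrt{\hbar}^{\,3}\,|\mathbf{p}\rangle \quad\text{and hence}\quad \langle\mathbf{r}|\mathbf{k}\rangle = \frac{\exp(i\mathbf{k}\cdot\mathbf{r})}{\sqrt{2\pi}^{\,3}} . \]

The transition from the momentum space to the real space representation (or vice versa) is a standard Fourier transform (see p. 22), since we have

\[ \langle\mathbf{r}|\psi\rangle = \int d^3k\;\langle\mathbf{r}|\mathbf{k}\rangle\,\langle\mathbf{k}|\psi\rangle = \frac{1}{\sqrt{2\pi}^{\,3}} \int d^3k\;\exp(i\mathbf{k}\cdot\mathbf{r})\,\langle\mathbf{k}|\psi\rangle . \]

Actually, we should have used the term wave vector representation instead of momentum representation. Other authors do not distinguish between these notions and simply state that they could have set $\hbar$ equal to 1.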


4.3.4 Wave Functions 


The wave function of a state is usually understood to be its real-space representation:

\[ \psi(\mathbf{r}) = \langle\mathbf{r}|\psi\rangle , \]

but generally the representation can be in any basis. The real-space representation is often stressed too strongly, since the momentum representation is more suitable for scattering problems and the angular momentum representation for problems with rotation invariance. We shall thus proceed here in a way that is as independent of the representation (as coordinate-free) as possible. Many prefer the real-space representation, but even where it seems the obvious choice, it is often rather inconvenient and in principle not superior to the other representations (as emphasized by H. S. Green in the introduction to his textbook, mentioned on p. 396).

If $|\psi\rangle$ is a proper Hilbert space vector, then the function $\psi(\mathbf{r})$ must be normalizable and infinitely differentiable. With the requirement $\langle\psi|\psi\rangle = 1$, we must have

\[ \int d^3r\;\psi^*(\mathbf{r})\,\psi(\mathbf{r}) = 1 , \]

thus in particular $\psi(\mathbf{r}) \to 0$ for $r \to \infty$, and $\psi(\mathbf{r})$ must be differentiable so that the momentum expectation value $\langle\psi|\,\mathbf{P}\,|\psi\rangle$ can be calculated. Higher powers of $\mathbf{P}$ require higher derivatives, as we have seen in Sect. 4.3.2.

We already had an example of a wave function in the last section, namely the wave function for a given momentum $\mathbf{p}$ (such that $\Delta P$ vanishes):

\[ \langle\mathbf{r}|\mathbf{p}\rangle = (2\pi\hbar)^{-3/2}\,\exp(i\mathbf{p}\cdot\mathbf{r}/\hbar) . \]

However, $|\mathbf{p}\rangle$ and $|\mathbf{r}\rangle$ are improper Hilbert vectors. Such states are idealizations. An ensemble can be characterized by continuous variables only if error widths (uncertainties) are included. To each continuous measurable quantity belongs a distribution function (density).


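Occasionally, improper Hilbert vectors are required [2]. They are very convenient, and with appropriate distribution functions, a fuzziness can still be introduced. For example, a wave packet can be formed from $\langle\mathbf{r}|\mathbf{k}\rangle$:

\[ \langle\mathbf{r}|\psi\rangle = \int d^3k\;\langle\mathbf{r}|\mathbf{k}\rangle\,\langle\mathbf{k}|\psi\rangle = \frac{1}{\sqrt{2\pi}^{\,3}} \int d^3k\;\exp(i\mathbf{k}\cdot\mathbf{r})\,\psi(\mathbf{k}) . \]

For $\psi(\mathbf{k}) \neq \delta(\mathbf{k} - \mathbf{k}_0)$, this has a non-vanishing momentum uncertainty, and for $\psi(\mathbf{k}) \neq c$, a position uncertainty.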

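We may ask which wave function has the smallest possible $\Delta X\cdot\Delta P$, i.e., equal to $\hbar/2$. According to p. 301, we must then have $(X - \overline{X})\,|\psi\rangle = -i\,(\Delta X/\Delta P)\,(P - \overline{P})\,|\psi\rangle$ with $1/\Delta P = 2\Delta X/\hbar$. With $\langle x|\,X\,|\psi\rangle = x\,\psi(x)$ and $\langle x|\,P\,|\psi\rangle = -i\hbar\,\psi'(x)$, we arrive at the differential equation

\[ \Big(\frac{d}{dx} - \frac{i\overline{P}}{\hbar}\Big)\,\psi = -\frac{x - \overline{X}}{2\,(\Delta X)^2}\;\psi(x) . \]

For an appropriate choice of phase, so that no integration constant remains free, its normalized solution reads (for the normalization of the Gauss function, see p. 23)

\[ \psi(x) = \frac{1}{\sqrt[4]{2\pi}\,\sqrt{\Delta X}}\;\exp\Big\{ -\Big(\frac{x - \overline{X}}{2\Delta X}\Big)^2 + \frac{i\overline{P}\,(x - \tfrac{1}{2}\overline{X})}{\hbar} \Big\} . \]

It contains three free parameters, namely $\overline{X}$, $\Delta X$, and $\overline{P}$, but the last drops out for the probability density $|\psi(x)|^2$. This density is a normal distribution (Gauss function) with maximum at $\overline{X}$. For the canonically conjugate variable, using a Fourier transform, we find another Gauss function:

\[ \psi(p) = \frac{1}{\sqrt[4]{2\pi}\,\sqrt{\Delta P}}\;\exp\Big\{ -\Big(\frac{p - \overline{P}}{2\Delta P}\Big)^2 - \frac{i\,(p - \tfrac{1}{2}\overline{P})\,\overline{X}}{\hbar} \Big\} . \]

We shall return to this result in the context of harmonic oscillations (Sect. 4.5.4). The phase factors $\exp(\mp\tfrac{1}{2}\,i\overline{P}\,\overline{X}/\hbar)$ have been added for the sake of the symmetry: then $\psi(x)$ and $\psi(p)$ are really mutually Fourier-transformed quantities.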
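The minimum-uncertainty property and the Fourier pair can be verified numerically. The sketch below assumes illustrative parameter values (`X0`, `dX`, `P0` stand for $\overline{X}$, $\Delta X$, $\overline{P}$), evaluates the Gaussian on a grid, and uses finite differences and simple Riemann sums, so the results are approximate.

```python
import numpy as np

hbar = 1.0
X0, dX, P0 = 0.7, 0.5, 1.3          # mean position, position uncertainty, mean momentum
dP = hbar / (2 * dX)                # momentum uncertainty of the minimum-uncertainty state

x = np.linspace(-15.0, 15.0, 6001)
h = x[1] - x[0]


def psi_x(x):
    """Gaussian wave function in the real-space representation."""
    return (2 * np.pi) ** -0.25 / np.sqrt(dX) * np.exp(
        -((x - X0) / (2 * dX)) ** 2 + 1j * P0 * (x - X0 / 2) / hbar)


def psi_p(p):
    """Gaussian wave function in the momentum representation."""
    return (2 * np.pi) ** -0.25 / np.sqrt(dP) * np.exp(
        -((p - P0) / (2 * dP)) ** 2 - 1j * (p - P0 / 2) * X0 / hbar)


psi = psi_x(x)
dpsi = np.gradient(psi, h)                                   # finite-difference derivative
mean_x = np.sum(x * np.abs(psi) ** 2) * h
var_x = np.sum((x - mean_x) ** 2 * np.abs(psi) ** 2) * h
mean_p = np.real(np.sum(np.conj(psi) * (-1j * hbar) * dpsi) * h)
var_p = hbar ** 2 * np.sum(np.abs(dpsi) ** 2) * h - mean_p ** 2
product = np.sqrt(var_x * var_p)                             # should be close to hbar / 2

# <p|psi> = (2 pi hbar)^(-1/2) * integral dx exp(-i p x / hbar) psi(x), by direct summation
p_test = np.array([0.3, 1.3, 2.0])
ft = np.array([np.sum(np.exp(-1j * p * x / hbar) * psi) * h for p in p_test])
ft /= np.sqrt(2 * np.pi * hbar)
ft_error = np.max(np.abs(ft - psi_p(p_test)))
```

The product of the uncertainties comes out as $\hbar/2$ up to the discretization error of the derivative, and the directly summed Fourier integral reproduces $\psi(p)$ including its phase factor.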


4.3.5 Wigner Function 


In statistical mechanics (p. 523), we introduce the classical density function $\rho_{\rm cl}(\mathbf{r}, \mathbf{p})$ in phase space and use it to determine the average values

\[ \langle A\rangle = \int d^3r\,d^3p\;\rho_{\rm cl}(\mathbf{r}, \mathbf{p})\,A_{\rm cl}(\mathbf{r}, \mathbf{p}) . \]

In quantum theory this density corresponds to the Wigner function. It follows via Fourier transforms from the density operator $\rho$ in the position or momentum representation.

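To show this, we adopt a basis $\{C(\mathbf{r}, \mathbf{p})\}$ of Hermitian, unitary, and orthogonal (in $\mathbf{r}$ and $\mathbf{p}$) operators. For example, the Pauli operators are Hermitian, unitary, and orthogonal in the space of $2\times 2$ matrices, according to p. 309. In the real-space representation,

\[ \langle\mathbf{r}_1|\,C(\mathbf{r}, \mathbf{p})\,|\mathbf{r}_2\rangle = \delta(2\mathbf{r} - \mathbf{r}_1 - \mathbf{r}_2)\;\exp\frac{+i\mathbf{p}\cdot(\mathbf{r}_1 - \mathbf{r}_2)}{\hbar} , \]

and in the momentum representation

\[ \langle\mathbf{p}_1|\,C(\mathbf{r}, \mathbf{p})\,|\mathbf{p}_2\rangle = \delta(2\mathbf{p} - \mathbf{p}_1 - \mathbf{p}_2)\;\exp\frac{-i\mathbf{r}\cdot(\mathbf{p}_1 - \mathbf{p}_2)}{\hbar} . \]

These are practical as an operator basis, according to Sect. 4.2.5, because

\[ C(\mathbf{r}, \mathbf{p}) = C^\dagger(\mathbf{r}, \mathbf{p}) = C^{-1}(\mathbf{r}, \mathbf{p}) , \qquad \mathrm{tr}\{C(\mathbf{r}, \mathbf{p})\,C(\mathbf{r}', \mathbf{p}')\} = (\tfrac{1}{2}\pi\hbar)^3\;\delta(\mathbf{r} - \mathbf{r}')\,\delta(\mathbf{p} - \mathbf{p}') . \]

As in Sect. 4.2.11, the expectation values of the basis operators are also important. They deliver the Wigner function

\[ \rho(\mathbf{r}, \mathbf{p}) \equiv \frac{\langle C(\mathbf{r}, \mathbf{p})\rangle}{(\pi\hbar)^3} , \]

which is in fact the Fourier transform of the density operator:

\[ \langle C(\mathbf{r}, \mathbf{p})\rangle = \int d^3r'\;\langle\mathbf{r} - \mathbf{r}'|\,\rho\,|\mathbf{r} + \mathbf{r}'\rangle\;\exp\frac{+2i\mathbf{p}\cdot\mathbf{r}'}{\hbar} = \int d^3p'\;\langle\mathbf{p} - \mathbf{p}'|\,\rho\,|\mathbf{p} + \mathbf{p}'\rangle\;\exp\frac{-2i\mathbf{r}\cdot\mathbf{p}'}{\hbar} . \]

Conversely, we obtain the density operator from the Wigner function (see Figs. 4.6 and 4.7):

\[ \langle\mathbf{r}|\,\rho\,|\mathbf{r}'\rangle = \int d^3p\;\rho\Big(\frac{\mathbf{r} + \mathbf{r}'}{2}, \mathbf{p}\Big)\;\exp\frac{+i\,(\mathbf{r} - \mathbf{r}')\cdot\mathbf{p}}{\hbar} , \]

\[ \langle\mathbf{p}|\,\rho\,|\mathbf{p}'\rangle = \int d^3r\;\rho\Big(\mathbf{r}, \frac{\mathbf{p} + \mathbf{p}'}{2}\Big)\;\exp\frac{-i\,(\mathbf{p} - \mathbf{p}')\cdot\mathbf{r}}{\hbar} . \]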


If we integrate the Wigner function $\rho(\mathbf{r}, \mathbf{p})$ over all momenta or all positions, we obtain the probability densities in position and momentum space, respectively:

\[ \int d^3p\;\rho(\mathbf{r}, \mathbf{p}) = \langle\mathbf{r}|\,\rho\,|\mathbf{r}\rangle \quad\text{and}\quad \int d^3r\;\rho(\mathbf{r}, \mathbf{p}) = \langle\mathbf{p}|\,\rho\,|\mathbf{p}\rangle . \]

Fig. 4.6 Superposition of the two states $\psi_\pm(x) \propto \exp\{-(x \mp 2)^2\}$: $\psi_+ - \psi_-$ (left) and $\psi_+ + \psi_-$ (right). Below the wave functions, the density operators $\rho = |\psi\rangle\langle\psi|$ are shown with equal-value lines for $\rho > 0$ (continuous lines) and for $\rho < 0$ (dotted lines), in the real-space representation ($\rho(x, x') = \langle x|\rho|x'\rangle$) and the momentum representation ($\rho(p, p') = \langle p|\rho|p'\rangle$). The axes can be recognized as symmetry axes, and $\rho$ is always real here. Along the diagonal $x' = x$, we have $\rho \ge 0$, which corresponds to the “classically expected” density

Fig. 4.7 Wigner functions $\rho(x, p)$ of the superpositions of states from Fig. 4.6. Equal-value lines are shown once again, for $\rho > 0$ (continuous lines) and for $\rho < 0$ (dotted lines). Here, $\rho$ is symmetric with respect to the $x$- and $p$-axes. The Wigner function can be negative, which depends sensitively on the phase difference of the superposed states, while in the “classically preferred” phase-space regions (here $x \approx \pm 2$, $p \approx 0$), there is almost no dependence


Hence we have the usual normalization $\int d^3r\,d^3p\;\rho(\mathbf{r}, \mathbf{p}) = 1$. Incidentally, a Fourier transform yields $\int d^3r\,d^3p\;\rho^2(\mathbf{r}, \mathbf{p}) = (2\pi\hbar)^{-3}\,\mathrm{tr}\,\rho^2$. We can thus also test whether we have a pure state or a mixture using the Wigner function. In addition, the Wigner function, being the expectation value of a Hermitian operator, is real. However, it can also be negative, and this distinguishes it from classical density functions: it is only a quasi-probability, but this difference is also necessary for the description of interference.
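The negativity of the Wigner function and its marginal property can be made concrete in one dimension. The sketch below evaluates $\rho(x, p) = (\pi\hbar)^{-1}\int dy\,\langle x - y|\rho|x + y\rangle \exp(2ipy/\hbar)$ by direct summation for the two superpositions of $\psi_\pm(x) \propto \exp\{-(x \mp 2)^2\}$. The grids and the function names `cat_state` and `wigner` are choices made for this example only.

```python
import numpy as np

hbar = 1.0


def cat_state(sign):
    """Normalized superposition psi_+ + sign * psi_- of two displaced Gaussians."""
    norm = 1.0 / np.sqrt(2 * np.sqrt(np.pi / 2) * (1 + sign * np.exp(-8.0)))
    return lambda x: norm * (np.exp(-(x - 2) ** 2) + sign * np.exp(-(x + 2) ** 2))


def wigner(psi, x, p, y):
    """Wigner function of the pure state rho = |psi><psi| on the grid (x, p)."""
    dy = y[1] - y[0]
    X, P, Y = x[:, None, None], p[None, :, None], y[None, None, :]
    integrand = psi(X - Y) * np.conj(psi(X + Y)) * np.exp(2j * P * Y / hbar)
    return np.real(integrand.sum(axis=2)) * dy / (np.pi * hbar)


x = np.linspace(-4.0, 4.0, 41)      # x[20] = 0
p = np.linspace(-8.0, 8.0, 161)     # p[80] = 0
y = np.linspace(-6.0, 6.0, 241)     # integration variable

psi_minus, psi_plus = cat_state(-1), cat_state(+1)
W_minus = wigner(psi_minus, x, p, y)
W_plus = wigner(psi_plus, x, p, y)

center_minus = W_minus[20, 80]      # value at (x, p) = (0, 0) for psi_+ - psi_-
center_plus = W_plus[20, 80]        # value at (x, p) = (0, 0) for psi_+ + psi_-

# Integrating over p should reproduce the position probability density |psi(x)|^2
dp = p[1] - p[0]
marginal_error = np.max(np.abs(W_minus.sum(axis=1) * dp - np.abs(psi_minus(x)) ** 2))
```

For the odd superposition the value at the origin is $-1/(\pi\hbar)$, and for the even one it is $+1/(\pi\hbar)$, which shows how strongly the sign between the classically preferred regions depends on the relative phase.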

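With the Wigner function, the expectation value of every observable $A(\mathbf{R}, \mathbf{P})$ can be determined. In particular, if we expand the operator $A$ in terms of the basis $\{C(\mathbf{r}, \mathbf{p})\}$, viz.,

\[ A = \int d^3r\,d^3p\;C(\mathbf{r}, \mathbf{p})\;\Big(\frac{2}{\pi\hbar}\Big)^3\,\mathrm{tr}\{C(\mathbf{r}, \mathbf{p})\,A\} , \]

according to p. 297, then using $\langle C(\mathbf{r}, \mathbf{p})\rangle = (\pi\hbar)^3\,\rho(\mathbf{r}, \mathbf{p})$, we can determine $\langle A\rangle$. We set

\[ A(\mathbf{r}, \mathbf{p}) \equiv 2^3\,\mathrm{tr}\{C(\mathbf{r}, \mathbf{p})\,A\} = 2^3 \int d^3r'\;\langle\mathbf{r} - \mathbf{r}'|\,A\,|\mathbf{r} + \mathbf{r}'\rangle\;\exp\frac{+2i\mathbf{p}\cdot\mathbf{r}'}{\hbar} = 2^3 \int d^3p'\;\langle\mathbf{p} - \mathbf{p}'|\,A\,|\mathbf{p} + \mathbf{p}'\rangle\;\exp\frac{-2i\mathbf{r}\cdot\mathbf{p}'}{\hbar} , \]

because then formally (only formally, since the Wigner function can also be negative) we have the same as in statistical mechanics, that is

\[ \langle A\rangle = \int d^3r\,d^3p\;\rho(\mathbf{r}, \mathbf{p})\,A(\mathbf{r}, \mathbf{p}) , \]

and $A(\mathbf{r}, \mathbf{p})$ is real for a Hermitian operator $A$.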


4.3.6 Spin 


So far we have taken the position or momentum representation and then proceeded as if a (pure) state were already defined by $\mathbf{r}$ or $\mathbf{p}$. But for electrons (and nucleons), we must also take into account their eigen angular momentum (spin). This degree of freedom must also be determined if the statistical ensemble is to be described uniquely. For this “inner degree of freedom”, we only require a Hilbert space of finite dimension. For electrons and nucleons, two dimensions suffice, so here we shall restrict ourselves to that situation and use Sect. 4.2.10. Hence, $|\mathbf{r}, \uparrow\rangle$ and $|\mathbf{r}, \downarrow\rangle$ fix the state, or indeed $|\mathbf{p}, \uparrow\rangle$ and $|\mathbf{p}, \downarrow\rangle$.




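Correspondingly, we have to distinguish the operators by the space in which they act. For example, neither $\mathbf{R}$ nor $\mathbf{P}$ affects the inner degrees of freedom: they act in the spin space as the unit operator. Conversely, $\boldsymbol{\sigma}$ does not act on $|\mathbf{r}\rangle$ and $|\mathbf{p}\rangle$. Hence $\mathbf{R}$ and $\mathbf{P}$ commute with $\boldsymbol{\sigma}$. Of course, there are also operators which act in both spaces, e.g., the helicity $(\mathbf{P}\cdot\mathbf{P})^{-1/2}\,\mathbf{P}\cdot\boldsymbol{\sigma}$, for which the orientation of the spin relative to the momentum is important.

If $\mathbf{A}$ and $\mathbf{B}$ do not act in the spin space, then with $\sigma_x^2 = 1$ and $\sigma_x\sigma_y = -\sigma_y\sigma_x = i\sigma_z$ (and cyclic permutations), we have

\[ \mathbf{A}\cdot\boldsymbol{\sigma}\;\,\mathbf{B}\cdot\boldsymbol{\sigma} = \mathbf{A}\cdot\mathbf{B} + i\,(\mathbf{A}\times\mathbf{B})\cdot\boldsymbol{\sigma} . \]

Since here $\mathbf{A}$ and $\mathbf{B}$ may be arbitrary vector operators, we have as special cases of this equation

\[ \mathbf{A}\cdot\boldsymbol{\sigma}\;\boldsymbol{\sigma} = \mathbf{A} - i\,\mathbf{A}\times\boldsymbol{\sigma} \quad\text{and}\quad [\boldsymbol{\sigma}, \mathbf{A}\cdot\boldsymbol{\sigma}] = 2i\,\mathbf{A}\times\boldsymbol{\sigma} . \]

The unit operator in the spin space is not written explicitly, as previously for $\mathbf{R}$ and $\mathbf{P}$. Moreover, on the left of the last equations, we should write $1\otimes\boldsymbol{\sigma}$ instead of just $\boldsymbol{\sigma}$.

If we write $\boldsymbol{\sigma}$ as a $2\times 2$ matrix, then the Hilbert vectors in the sequence space must also be written as 2-spinors: for $\psi$ the two elements atop each other, for $\psi^\dagger$ side-by-side and complex-conjugate to those of $\psi$.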
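The product rule for the Pauli operators can be checked with explicit $2\times 2$ matrices. The sketch below assumes the special case in which $\mathbf{A}$ and $\mathbf{B}$ are ordinary real vectors (random numbers here), so that they trivially commute with $\boldsymbol{\sigma}$; the helper `dot_sigma` is an illustrative name.

```python
import numpy as np

# Pauli matrices stacked as sigma[0], sigma[1], sigma[2] = sigma_x, sigma_y, sigma_z
sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)


def dot_sigma(v):
    """Return v . sigma as a 2x2 matrix for a numerical 3-vector v."""
    return np.einsum("i,ijk->jk", v, sigma)


rng = np.random.default_rng(1)
A, B = rng.normal(size=3), rng.normal(size=3)

# (A.sigma)(B.sigma) = A.B + i (A x B).sigma
lhs = dot_sigma(A) @ dot_sigma(B)
rhs = np.dot(A, B) * np.eye(2) + 1j * dot_sigma(np.cross(A, B))
product_rule_holds = np.allclose(lhs, rhs)

# [sigma_j, A.sigma] = 2i (A x sigma)_j, where (A x sigma)_j = (e_j x A) . sigma
unit = np.eye(3)
commutator_rule_holds = all(
    np.allclose(sigma[j] @ dot_sigma(A) - dot_sigma(A) @ sigma[j],
                2j * dot_sigma(np.cross(unit[j], A)))
    for j in range(3)
)
```

The second check rewrites the component $(\mathbf{A}\times\boldsymbol{\sigma})_j$ as $(\mathbf{e}_j\times\mathbf{A})\cdot\boldsymbol{\sigma}$, which follows from the cyclic symmetry of the triple product.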


4.3.7 Correspondence Principle 


In quantum theory, we describe all observables using Hermitian operators whose eigenvalues correspond to the possible experimental values. So far we have presented only two observables, namely position and momentum, but according to Hamiltonian mechanics, further quantities can be derived. The corresponding observables in quantum theory are in general easy to find: we simply have to take the classical equations as operator equations. If in classical physics $y = f(x, p)$, where $y$, $x$, and $p$ have real values, then usually in quantum theory $Y = f(X, P)$, where $Y$, $X$, and $P$ are Hermitian operators. Hence we have given a mathematical form to Bohr's correspondence principle. Classical and quantum mechanical quantities correspond to one another to a large extent, but differ in their mathematical nature, since instead of classical quantities (number times unit), we now have linear operators.

However, the operators of canonically conjugate quantities do not commute with each other, so for products the order of the factors is important. This difficulty rarely arises though. Let us take, e.g., the orbital angular momentum

\[ \mathbf{L} = \mathbf{R}\times\mathbf{P} . \]

In the vector product here, only components that commute with each other are multiplied, so no problem arises. Although $\mathbf{L}$ does not generally commute with $\mathbf{R}$ and $\mathbf{P}$, at least equal components do: $\mathbf{L}\cdot\mathbf{R} = \mathbf{R}\cdot\mathbf{L}$ and $\mathbf{L}\cdot\mathbf{P} = \mathbf{P}\cdot\mathbf{L}$.

If necessary, we can invoke the Weyl correspondence. If the Wigner function is used, then a Fourier transform is allowed. In particular, if the classical function $f(x, p)$ is given, its Fourier transform reads

\[ \tilde{f}(\alpha, \beta) = \frac{1}{2\pi} \int dx\,dp\;\exp\{-i\,(\alpha x + \beta p)\}\;f(x, p) , \]

and its operator function (where $\alpha$ and $\beta$ remain real variables)

\[ f(X, P) = \frac{1}{2\pi} \int d\alpha\,d\beta\;\exp\{+i\,(\alpha X + \beta P)\}\;\tilde{f}(\alpha, \beta) . \]

On p. 290, we already derived the relations (note that $[i\alpha X, i\beta P] = -i\hbar\,\alpha\beta$)

\[ \exp\{i\,(\alpha X + \beta P)\} = \exp(i\alpha X)\,\exp(i\beta P)\,\exp(+\tfrac{1}{2}\,i\hbar\,\alpha\beta) = \exp(i\beta P)\,\exp(i\alpha X)\,\exp(-\tfrac{1}{2}\,i\hbar\,\alpha\beta) , \]

so we can determine $f(X, P)$ from a double Fourier integral, once we have found $\tilde{f}(\alpha, \beta)$. In this way, $f(x)\,p$ has the Fourier transformed form $\tilde{f}(\alpha, \beta) = \sqrt{2\pi}\;i\,\tilde{f}(\alpha)\,\delta'(\beta)$, and hence, according to Weyl, we have to take $f(X, P) = f(X)\,P - \tfrac{1}{2}\,i\hbar\,f'(X)$. According to p. 316, in particular, with $i\hbar\,f'(X) = [f(X), P]$, this leads to the symmetrized product $\tfrac{1}{2}\,\{f(X), P\}$. Generally, the power series of the exponential function of $i\,(\alpha X + \beta P)$ leads to completely symmetrized products of $X$ and $P$.

If we use quasi-probabilities instead of the Wigner function, we have to order differently, as will be discussed in Sect. 5.5.6.

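Let us consider, e.g., the Hamilton operator for a particle of mass $m$ and charge $q$ in an electromagnetic field. According to p. 123, the classical Hamilton function is $\frac{1}{2m}\,(\mathbf{p} - q\mathbf{A})\cdot(\mathbf{p} - q\mathbf{A}) + q\Phi$. The quantities $m$ and $q$ do not become operators, and in the usual quantum theory neither does the electromagnetic field; this happens only in quantum electrodynamics (see Sect. 5.5). Since $\mathbf{P}$ does not commute with $\mathbf{A}$, we arrive at

\[ H = \frac{\mathbf{P}\cdot\mathbf{P} - q\,(\mathbf{P}\cdot\mathbf{A} + \mathbf{A}\cdot\mathbf{P}) + q^2\,\mathbf{A}\cdot\mathbf{A}}{2m} + q\Phi , \]

thus at the symmetrical product $\{P_k, A^k\}$. Now in the real-space representation, $\mathbf{P}$ corresponds to the operator $-i\hbar\nabla$, and we find $\nabla\cdot\mathbf{A}\psi = \psi\,\nabla\cdot\mathbf{A} + \mathbf{A}\cdot\nabla\psi$. For the Coulomb gauge, $\nabla\cdot\mathbf{A}$ vanishes, whence $\mathbf{P}\cdot\mathbf{A} = \mathbf{A}\cdot\mathbf{P}$ holds, even though $\mathbf{P}$ and $\mathbf{R}$ do not commute with each other. For a homogeneous magnetic field $\mathbf{B}$, we have in particular for the vector potential (in the Coulomb gauge) $\mathbf{A} = \tfrac{1}{2}\,\mathbf{B}\times\mathbf{R}$, and hence $\mathbf{P}\cdot\mathbf{A} + \mathbf{A}\cdot\mathbf{P} = (\mathbf{B}\times\mathbf{R})\cdot\mathbf{P} = \mathbf{B}\cdot(\mathbf{R}\times\mathbf{P}) = \mathbf{B}\cdot\mathbf{L}$. Here, according to p. 191, a point charge $q$ of mass $m$ with orbital angular momentum $\mathbf{L}$ has magnetic moment $\boldsymbol{\mu} = \frac{q}{2m}\,\mathbf{L}$, giving a potential energy $-\boldsymbol{\mu}\cdot\mathbf{B}$ in addition to $q\Phi$.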

However, this ansatz does not suffice for electrons in a magnetic field because they have one more inner moment, which is connected to their spin and which has not been accounted for so far. Here it has been shown that the Pauli equation, viz.,

\[ H = \frac{P^2 - q\,\mathbf{B}\cdot(\mathbf{L} + \hbar\boldsymbol{\sigma}) + q^2A^2}{2m} + q\Phi = \frac{(\mathbf{P} - q\mathbf{A})\cdot(\mathbf{P} - q\mathbf{A})}{2m} + q\Phi - \frac{q\hbar}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B} , \]

is appropriate. The new feature is the last term, where the factor

\[ \mu_B \equiv \frac{e\hbar}{2m_e} \]

is known as the Bohr magneton. Due to the factor $\boldsymbol{\sigma}$ in the Pauli equation, $H$ acts on a wave function with two components, a 2-spinor, which we shall discuss in Sect. 4.5.8. For a homogeneous magnetic field $\mathbf{B}$ (we restrict ourselves to this case), the Pauli equation can be brought into the form

\[ H = \frac{\{(\mathbf{P} - q\mathbf{A})\cdot\boldsymbol{\sigma}\}^2}{2m} + q\Phi , \]


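since according to p. 325, we have

\[ \{(\mathbf{P} - q\mathbf{A})\cdot\boldsymbol{\sigma}\}^2 = (\mathbf{P} - q\mathbf{A})\cdot(\mathbf{P} - q\mathbf{A}) + i\,\{(\mathbf{P} - q\mathbf{A})\times(\mathbf{P} - q\mathbf{A})\}\cdot\boldsymbol{\sigma} . \]

If $\mathbf{P}$ were to commute with $\mathbf{A}$, then the vector product would vanish, but now for $\mathbf{A} = \tfrac{1}{2}\,\mathbf{B}\times\mathbf{R}$, the term $\mathbf{P}\times\mathbf{A} + \mathbf{A}\times\mathbf{P} = -i\hbar\,\mathbf{B}$ remains, since $\mathbf{B}$ commutes with $\mathbf{R}$ and $\mathbf{P}$ and hence $\mathbf{P}\times\mathbf{A} + \mathbf{A}\times\mathbf{P} = \tfrac{1}{2}\,[\mathbf{R}, \mathbf{B}\cdot\mathbf{P}] - \tfrac{1}{2}\,\mathbf{B}\,(\mathbf{R}\cdot\mathbf{P} - \mathbf{P}\cdot\mathbf{R})$.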

In the form $H = \frac{1}{2m}\,\{(\mathbf{P} - q\mathbf{A})\cdot\boldsymbol{\sigma}\}^2 + q\Phi$, the Pauli equation is the non-relativistic limiting case of the Dirac equation (as will be shown in Sect. 5.6.8). Hence the results here do not describe relativistic effects, even though it is sometimes claimed otherwise. Incidentally, $\hbar\boldsymbol{\sigma}$ will turn out to be twice the spin momentum $\mathbf{S}$ in the next section. In the Pauli equation, it thus occurs as the scalar product $(\mathbf{L} + 2\mathbf{S})\cdot\mathbf{B}$. The spin momentum enters with twice the weight (magneto-mechanical anomaly). So this factor of 2 is not a relativistic effect.

If the classical equations are valid for operators in quantum theory, this will also apply for the expectation values. However, the expectation value of a product is not generally equal to the product of the expectation values; that would only be true for eigenstates. Hence, generally, we also have $\overline{A^2} \neq \overline{A}^{\,2}$ and then $\Delta A > 0$.




4.3.8 Angular Momentum Operator 


The orbital angular momentum operator is defined by

\[ \mathbf{L} = \mathbf{R}\times\mathbf{P} , \]

where the fact that $\mathbf{R}$ and $\mathbf{P}$ do not commute does not create problems, because in the vector product only factors commuting with each other occur together. Hence $\mathbf{L}$ is also Hermitian like $\mathbf{R}$ and $\mathbf{P}$.

From the commutation relations for $\mathbf{R}$ and $\mathbf{P}$, we find

\[ \begin{aligned}
[L_x, X] &= 0 , & [L_x, Y] &= i\hbar\,Z , & [L_x, Z] &= -i\hbar\,Y ,\\
[L_x, P_x] &= 0 , & [L_x, P_y] &= i\hbar\,P_z , & [L_x, P_z] &= -i\hbar\,P_y ,\\
[L_x, L_y] &= i\hbar\,L_z ,
\end{aligned} \]

since we have, e.g., $[L_x, X] = [YP_z - ZP_y, X] = 0$, but

\[ [L_x, Y] = -[ZP_y, Y] = Z\,[Y, P_y] = i\hbar\,Z . \]

The above are valid for $L_y$ and $L_z$, with suitable cyclic permutations. Hence we find the commutator $[L_x, L_y] = [L_x, ZP_x - XP_z] = -i\hbar\,YP_x + i\hbar\,XP_y = i\hbar\,L_z$. Generally, for a vector operator $\mathbf{A}$, we can derive the commutation relation

\[ [\mathbf{L}\cdot\mathbf{e}_1, \mathbf{A}\cdot\mathbf{e}_2] = i\hbar\,\mathbf{A}\cdot(\mathbf{e}_1\times\mathbf{e}_2) , \]

because, according to Hamiltonian mechanics (see p. 130), the angular momentum is the generating function of infinitesimal rotations. In addition, the corresponding equations for the Poisson brackets are valid with $\mathbf{R}$ or $\mathbf{P}$ instead of $\mathbf{A}$ (see Problems 2.44 and 4.30).

The commutation relations $[L_x, L_y] = i\hbar\,L_z$ (and cyclic) mean that there are generally no common eigenvectors for all three components of the angular momentum operator. We can make only one component diagonal. As for the spherical coordinates, we prefer the $z$-component and choose $\mathbf{e}_z$ as the quantization direction. Then the $x$- and $y$-components do also have unique expectation values, but with uncertainties. In general, the angular momenta in a state have no sharp direction. They are unsharp (uncertain), as in the time average for each precession, for which only the component along the precession vector is fixed. This is shown in Fig. 4.8.

We have already encountered commutation relations similar to those for the components of the orbital angular momentum $\mathbf{L}$, viz., for the Pauli operators on p. 309. These read $[\sigma_x, \sigma_y] = 2i\,\sigma_z$ and cyclic permutations. Hence with

\[ \mathbf{S} = \tfrac{1}{2}\,\hbar\,\boldsymbol{\sigma} , \]

we conclude

\[ [S_x, S_y] = i\hbar\,S_z \quad\text{and cyclic permutations.} \]

In fact, we need $\mathbf{S} = \tfrac{1}{2}\,\hbar\,\boldsymbol{\sigma}$ for the spin (eigen angular momentum) of electrons and nucleons. But this is easier to treat than the orbital angular momentum, because only two eigenvectors occur. For the three Cartesian components, we have $\sigma_i^2 = 1$, and hence $S^2 = S_x^2 + S_y^2 + S_z^2 = \tfrac{3}{4}\,\hbar^2\,1$.

Fig. 4.8 Angular momentum eigenstates. For sharp $L^2$, all allowable vectors $\mathbf{L}$ have the same absolute value. They span a sphere (dashed circle). Here $l = 1$ is chosen. Then there are three eigenstates $|l, m\rangle$ with sharp $L_z$, and hence uncertain $L_x$ and $L_y$. Their angular momentum vectors thus form three cones about the quantization axis. The one for $m = 0$ degenerates to a circle

The square of the orbital angular momentum, viz.,

\[ L^2 \equiv L_x^2 + L_y^2 + L_z^2 , \]

is Hermitian and commutes with all components:

\[ [L^2, L_z] = 0 = [L^2, L_x] = [L^2, L_y] , \]

since

\[ [L_x^2, L_z] = L_x\,[L_x, L_z] + [L_x, L_z]\,L_x \]

is equal to

\[ -[L_y, L_z]\,L_y - L_y\,[L_y, L_z] = -[L_y^2, L_z] . \]

Hence, there is a complete orthonormal system of eigenvectors of $L^2$ and $L_z$.

Since the operators $L^2$ and $L_z$ are Hermitian, they have real eigenvalues, and we shall now seek these, along with a set of common eigenvectors. From the commutation relations, we will determine the eigenvalues $l\,(l+1)\,\hbar^2$ with $l \in \{0, 1, \ldots\}$ of $L^2$ and the eigenvalues $m\hbar$ with $m \in \{0, \pm 1, \ldots, \pm l\}$ of $L_z$, where $l$ and $m$ could also take half-integer values (1/2, 3/2, etc.). But half-integer values do not lead to a unique real-space representation (see the next section) and are therefore to be discarded. This is different for the inner degree of freedom, where the values $s = 1/2$ and $m = \pm 1/2$ are allowed.




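The proof is similar to that for the field operators (see Sect. 4.2.8). We use the non-Hermitian operators

\[ L_\pm \equiv L_x \pm i\,L_y = L_\mp^\dagger \quad\Longleftrightarrow\quad L_x = \frac{L_+ + L_-}{2} , \quad L_y = \frac{L_+ - L_-}{2i} , \]

with the properties

\[ [L_z, L_\pm] = \pm\hbar\,L_\pm , \qquad [L_+, L_-] = 2\hbar\,L_z , \qquad [L^2, L_\pm] = 0 = [L_z, L_\pm L_\mp] . \]

Now let $|a, b\rangle$ be a common eigenvector of $L^2$ and $L_z$, so that $L^2\,|a, b\rangle = |a, b\rangle\,a\hbar^2$ and $L_z\,|a, b\rangle = |a, b\rangle\,b\hbar$. Then with the commutation relations, we obtain the following results for $L_\pm\,|a, b\rangle$:

\[ L^2\,L_\pm\,|a, b\rangle = L_\pm\,|a, b\rangle\;a\hbar^2 , \qquad L_z\,L_\pm\,|a, b\rangle = L_\pm\,|a, b\rangle\;(b \pm 1)\,\hbar . \]

The ladder operators $L_\pm$ thus connect eigenstates of $L^2$ with equal eigenvalue, but with a different eigenvalue of $L_z$, i.e., $L_\pm\,|a, b\rangle \propto |a, b \pm 1\rangle$. Hence, we call $L_+$ a creation operator and $L_-$ an annihilation operator.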

However, the construction method with the ladder operators has to lead to the zero vector after a finite number of steps, and then stop. Otherwise, the norm of the vectors $L_\pm\,|a, b\rangle$ might become imaginary. From

\[ L^2 = L_z^2 + \tfrac{1}{2}\,(L_+L_- + L_-L_+) \]

and the commutation relation $[L_+, L_-] = 2\hbar\,L_z$, it follows that

\[ L_\mp L_\pm = L^2 - L_z\,(L_z \pm \hbar) , \]

and hence for the squared norm of $L_\pm\,|a, b\rangle$, which is just the expectation value $\langle a, b|\,L_\mp L_\pm\,|a, b\rangle$, we obtain the value $\{a - b\,(b \pm 1)\}\,\hbar^2$. Hence, the expression must vanish for $b_{\max}$ and $b_{\min}$:

\[ a = b_{\max}\,(b_{\max} + 1) = b_{\min}\,(b_{\min} - 1) . \]

We deduce that $b_{\min} = -b_{\max}$ (or $b_{\min} = b_{\max} + 1$, but this contradicts $b_{\min} \le b_{\max}$). Starting from $|a, b_{\min}\rangle$, we must arrive at $|a, b_{\max}\rangle$ with the creation operator $L_+$. Hence, $b_{\max} - b_{\min} = 2\,b_{\max}$ is an integer and the claim is proven. We denote $b_{\max}$ by $l$ and usually write $m$ for $b$. Following the usual practice, we write for short $|l, m\rangle$ instead of $|l\,(l+1), m\rangle$. Incidentally, the orbital angular momentum eigenstates are often not specified by the value of $l$, but by letters. The first four have historical origin, the rest follow in alphabetical order, without j (see Table 4.1).

Table 4.1 Coulomb-state “quantum numbers”
l      0   1   2   3   4   5   6   7
Name   s   p   d   f   g   h   i   k

With the eigenvalue equations

\[ \begin{aligned}
L^2\,|l, m\rangle &= |l, m\rangle\;l\,(l+1)\,\hbar^2 , &&\text{with } l \in \{0, 1, 2, \ldots\} ,\\
L_z\,|l, m\rangle &= |l, m\rangle\;m\hbar , &&\text{with } m \in \{0, \pm 1, \ldots, \pm l\} ,
\end{aligned} \]

the phase factors are not yet determined. But since Condon and Shortley [5], the phase factor for $L_\pm$ is chosen positive real and the relative phases of the states with equal $l$ are then determined by

\[ L_\pm\,|l, m\rangle = |l, m \pm 1\rangle\,\sqrt{l\,(l+1) - m\,(m \pm 1)}\;\hbar = |l, m \pm 1\rangle\,\sqrt{(l \mp m)\,(l \pm m + 1)}\;\hbar , \]

using $L_\mp L_\pm = L^2 - L_z\,(L_z \pm \hbar)$. The relative phases of states with unequal $l$ are still free. Hence we can still arrange things so that the matrix elements of all those operators that are invariant under rotations and time-reversal are real. This is possible, e.g., by satisfying the requirement

\[ \mathcal{T}\,|l, m\rangle = (-)^{l+m}\,|l, -m\rangle . \]

But we shall not deal with this here, because we would then have to investigate the behavior of the states under rotations.
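The matrix elements of the ladder operators translate directly into finite matrices. The following sketch builds $L_x$, $L_y$, $L_z$ in the basis $|l, m\rangle$ with $m = l, l-1, \ldots, -l$ and checks the commutation relations. The function name `angular_momentum_matrices` and the basis ordering are choices made for this example; half-integer values of `l` are allowed as well and then describe spin.

```python
import numpy as np


def angular_momentum_matrices(l, hbar=1.0):
    """Return Lx, Ly, Lz and the list of m values in the basis |l,m>, m = l, ..., -l."""
    dim = int(round(2 * l + 1))
    m = l - np.arange(dim)                               # m = l, l-1, ..., -l
    Lz = hbar * np.diag(m).astype(complex)
    # <l,m+1| L_+ |l,m> = hbar * sqrt((l - m)(l + m + 1)) for m = l-1, ..., -l
    raising = hbar * np.sqrt((l - m[1:]) * (l + m[1:] + 1))
    Lp = np.diag(raising, k=1).astype(complex)           # raises m by one unit
    Lm = Lp.conj().T                                     # L_- is the adjoint of L_+
    Lx, Ly = (Lp + Lm) / 2, (Lp - Lm) / 2j
    return Lx, Ly, Lz, m


Lx, Ly, Lz, m = angular_momentum_matrices(1)
commutator_ok = np.allclose(Lx @ Ly - Ly @ Lx, 1j * Lz)   # [Lx, Ly] = i hbar Lz
L_squared = Lx @ Lx + Ly @ Ly + Lz @ Lz                   # should equal l(l+1) hbar^2 * 1
var_Lx = np.diag(Lx @ Lx).real                            # <l,m| Lx^2 |l,m> for each m
```

For `l = 1` the diagonal of $L_x^2$ gives the values $\tfrac{1}{2}\{l\,(l+1) - m^2\}\,\hbar^2$, i.e. 0.5, 1.0, 0.5 for $m = 1, 0, -1$, and for `l = 0.5` the matrices reduce to $\tfrac{1}{2}\hbar$ times the Pauli matrices.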

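In the states $|l, m\rangle$, the expectation values of $L_\pm$ vanish and so therefore do those of $L_x$, $L_y$ and $L_x^2 - L_y^2 = \tfrac{1}{2}\,(L_+^2 + L_-^2)$. Consequently, we have $(\Delta L_x)^2 = \langle L_x^2\rangle = \langle L_y^2\rangle = (\Delta L_y)^2$:

\[ (\Delta L_x)^2 = (\Delta L_y)^2 = \tfrac{1}{2}\,\langle L^2 - L_z^2\rangle = \tfrac{1}{2}\,\{l\,(l+1) - m^2\}\,\hbar^2 \ge \tfrac{1}{2}\,l\,\hbar^2 . \]

For fixed $l$, these uncertainties are smallest for $m = \pm l$ and greatest for $m = 0$. Only the s-state is such that all three components of the angular momentum are sharp.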


4.3.9 Spherical Harmonics 


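The spherical harmonics are the real-space representation of the orbital angular
momentum eigenstates $|l,m\rangle$. However, it is not the length of the position vector that
is important, but only its direction. Hence it is practical to calculate with spherical
coordinates $(r,\theta,\varphi)$. With $\langle \mathbf r\,|\mathbf R\,|\mathbf r'\rangle = \mathbf r\,\delta(\mathbf r-\mathbf r')$ and
$\langle \mathbf r\,|\mathbf P\,|\psi\rangle = -i\hbar\,\nabla\psi(\mathbf r)$, we have

$$\langle \mathbf r\,|\mathbf L\,|\psi\rangle = \frac{\hbar}{i}\;\mathbf r\times\nabla\psi(\mathbf r), \quad\text{with}\quad
\mathbf r\times\nabla = -\,\mathbf e_\theta\,\frac{1}{\sin\theta}\frac{\partial}{\partial\varphi} + \mathbf e_\varphi\,\frac{\partial}{\partial\theta},$$

where (see Fig. 1.12)

$$\mathbf e_\theta = \cos\theta\,(\cos\varphi\;\mathbf e_x + \sin\varphi\;\mathbf e_y) - \sin\theta\;\mathbf e_z,$$
$$\mathbf e_\varphi = -\sin\varphi\;\mathbf e_x + \cos\varphi\;\mathbf e_y.$$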


The angular momentum operators thus act only on the angular coordinates
$\Omega = (\theta,\varphi)$, not on the length of $\mathbf r$. Hence, in the following, we consider

$$\langle\Omega\,|\,lm\rangle = i^l\; Y^{(l)}_m(\Omega).$$

The factor $i^l$ is a practical phase factor which turns out to be useful for time reversal.
In particular, with $\mathcal T\,|\Omega\rangle = |\Omega\rangle$ and $\mathcal T\,|l,m\rangle = (-)^{l+m}\,|l,-m\rangle$, we find
$\langle\Omega\,|\,l,m\rangle^* = (-)^{l+m}\,\langle\Omega\,|\,l,-m\rangle$ and hence (with the factor $i^l$),

$$Y^{(l)*}_m(\Omega) = (-)^m\; Y^{(l)}_{-m}(\Omega).$$

Consequently, all spherical harmonics with $m = 0$ are real; we can even arrange for
them all to be positive for $\Omega$ in the z-direction, i.e., for $(\theta,\varphi) = (0,0)$. Without the
factor $i^l$, this would not be possible.

Since $L_z$ in the real-space representation corresponds to the operator $-i\hbar\,\partial/\partial\varphi$,
and since we also have $L_z\,|lm\rangle = |lm\rangle\, m\hbar$, the function $\langle\Omega\,|\,lm\rangle$ must depend
on $\varphi$ via the factor $\exp(im\varphi)$. It is only unique (mod $2\pi$) if $m$ is an integer, and thus
also $l$ must be an integer, i.e., $l \in \{0, 1, \dots\}$. The commutation relations also allow
half-integer values, which would be connected with an ambiguity, and this is without
contradiction only for unobservable internal coordinates (spin).

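We set $Y^{(l)}_m(\Omega) = f_{lm}(\theta)\,\exp(im\varphi)$ and determine the unknown function $f_{lm}$ using
the ladder operators. With

$$\mathbf e_\theta\cdot(\mathbf e_x \pm i\,\mathbf e_y) = \cos\theta\,\exp(\pm i\varphi), \qquad
\mathbf e_\varphi\cdot(\mathbf e_x \pm i\,\mathbf e_y) = \pm i\,\exp(\pm i\varphi),$$

and consequently also

$$(\mathbf r\times\nabla)_\pm = \exp(\pm i\varphi)\,\Big(-\cot\theta\,\frac{\partial}{\partial\varphi} \pm i\,\frac{\partial}{\partial\theta}\Big),$$

we have

$$\langle\Omega\,|\,L_\pm\,|\,lm\rangle = \langle\Omega\,|\,l,m\pm1\rangle\,\sqrt{(l\mp m)(l\pm m+1)}\;\hbar
 = \hbar\,\exp(\pm i\varphi)\,\Big(\pm\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\Big)\,\langle\Omega\,|\,lm\rangle.$$

Hence, we obtain the differential equation

$$\Big(\pm\frac{d}{d\theta} - m\cot\theta\Big)\, f_{lm}(\theta) = f_{l,m\pm1}(\theta)\,\sqrt{(l\mp m)(l\pm m+1)}.$$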


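In particular, $\langle\Omega\,|\,L_\pm\,|\,l,\pm l\rangle$ vanishes. Then $(d/d\theta - l\cot\theta)\, f_{l,\pm l}(\theta) = 0$, and
consequently, $f_{l,\pm l}(\theta) \propto \sin^l\theta$. The value of the still missing factors is determined by
the normalization condition $\int d\Omega\; |\langle\Omega\,|\,lm\rangle|^2 = 1$. From

$$\int_0^\pi \sin^{2l+1}\theta\; d\theta = \frac{2\,(2^l\, l!)^2}{(2l+1)!},$$

we deduce an appropriate choice of the phase:

$$Y^{(l)}_{\pm l}(\Omega) = \frac{(\mp)^l}{2^l\, l!}\,\sqrt{\frac{(2l+1)!}{4\pi}}\;\sin^l\theta\;\exp(\pm i\,l\varphi).$$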


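The remaining spherical harmonics are now obtained by applying the ladder operators
$L_\pm$. However, the operator $\pm d/d\theta - m\cot\theta$ is not quite appropriate here, because it
contains two terms. But let us consider the function $\sin^{\mp m}\theta\; f_{lm}$ and take $\cos\theta$ instead
of $\theta$ as the variable. We only need $0 \le \theta \le \pi$ anyway. Then $d/d\theta = -\sin\theta\; d/d\cos\theta$
leads to

$$\frac{d\,(\sin^{\mp m}\theta\; f_{lm})}{d\cos\theta}
 = \mp\,\sin^{\mp m-1}\theta\,\Big(\pm\frac{d}{d\theta} - m\cot\theta\Big)\, f_{lm}
 = \mp\,\sin^{\mp(m\pm1)}\theta\; f_{l,m\pm1}\,\sqrt{l\mp m}\,\sqrt{l\pm m+1}.$$

After differentiating $n$ times, we have on the right-hand side

$$(\mp)^n\,\sin^{\mp(m\pm n)}\theta\; f_{l,m\pm n}\,\sqrt{\frac{(l\mp m)!\;(l\pm m+n)!}{(l\mp m-n)!\;(l\pm m)!}}.$$

Hence,

$$f_{l,m\pm n} = (\mp)^n\,\sin^{\pm(m\pm n)}\theta\;\frac{d^n\,(\sin^{\mp m}\theta\; f_{lm})}{d\cos^n\theta}\,
\sqrt{\frac{(l\pm m)!\;(l\mp m-n)!}{(l\mp m)!\;(l\pm m+n)!}}.$$

This recursion formula connects all spherical harmonics with equal $l$ to each other.
This is achieved by the ladder operators, according to the last section. In particular,
with $L_-$ and for $n = m = l$, it leads to $f_{l0} = \{d^l(\sin^l\theta\; f_{ll})/d\cos^l\theta\}\;(2l)!^{-1/2}$, or (see
Fig. 4.9)

$$Y^{(l)}_0(\Omega) = \frac{(-)^l}{2^l\, l!}\,\sqrt{\frac{2l+1}{4\pi}}\;\frac{d^l\,\sin^{2l}\theta}{d\cos^l\theta}
 = \sqrt{\frac{2l+1}{4\pi}}\; P_l(\cos\theta).$$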


Fig. 4.9 Spherical harmonics. Their positive real part is shown in white, the negative
part hatched. l = 0(1)2 increases upwards from sphere to sphere, m to the right. In
addition, there are two frames


Here $P_l(\cos\theta)$ is a Legendre polynomial. We already met them in Sect. 2.2.7,
when we considered their generating function

$$\frac{1}{\sqrt{1 - 2sz + s^2}} = \sum_{n=0}^{\infty} P_n(z)\; s^n \quad\text{for } |s| < 1.$$

They lead to $P_0(z) = 1$, $P_1(z) = z$, and the recursion formula

$$(n+1)\,P_{n+1}(z) - (2n+1)\, z\, P_n(z) + n\, P_{n-1}(z) = 0.$$

We also proved the orthonormalization condition on p. 82:

$$\int_{-1}^{1} dz\; P_n(z)\, P_{n'}(z) = \frac{2}{2n+1}\;\delta_{nn'}.$$


Hence we can also show the Rodrigues formula, viz.,

$$P_n(z) = \frac{1}{2^n\, n!}\;\frac{d^n\,(z^2-1)^n}{dz^n}.$$

We have not met this formula so far in connection with the Legendre polynomials. If
we integrate by parts, where we may assume $n \le n'$, then we obtain, for $n' > 0$,

$$\int_{-1}^{1} dz\;\frac{d^n (z^2-1)^n}{dz^n}\;\frac{d^{n'} (z^2-1)^{n'}}{dz^{n'}}
 = (-)^n \int_{-1}^{1} dz\;\frac{d^{2n} (z^2-1)^n}{dz^{2n}}\;\frac{d^{n'-n} (z^2-1)^{n'}}{dz^{n'-n}},$$

with the factor $d^{2n}(z^2-1)^n/dz^{2n} = (2n)!$. For $n' > n$, this is zero and otherwise
equal to $(2n)!\,\int_0^\pi d\theta\;\sin^{2n+1}\theta = (2^n n!)^2\; 2/(2n+1)$. The polynomials defined by
Rodrigues' formula are thus also orthonormalized like the Legendre polynomials
and are real polynomials of the same degree. Hence, they can differ from each other
by at most a sign. But the coefficients for the highest power are positive according
to the recursion formula and also according to the Rodrigues formula. This leads to

$$P_n(z) = \frac{1}{2^n}\,\sum_k \frac{(-)^k\,(2n-2k)!}{k!\;(n-k)!\;(n-2k)!}\; z^{n-2k}.$$

From here, we have $P_n(-z) = (-)^n\, P_n(z)$.
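The agreement between the recursion formula and the explicit sum can be checked numerically. The following is a minimal sketch; the function names `legendre_recursive` and `legendre_explicit` are illustrative choices, not notation from the text.

```python
from math import factorial

def legendre_recursive(n, z):
    """P_n(z) from (k+1) P_{k+1} = (2k+1) z P_k - k P_{k-1}, with P_0 = 1, P_1 = z."""
    p_prev, p = 1.0, z
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * z * p - k * p_prev) / (k + 1)
    return p

def legendre_explicit(n, z):
    """P_n(z) from the explicit sum over k that follows from the Rodrigues formula."""
    total = 0.0
    for k in range(n // 2 + 1):
        coeff = (-1) ** k * factorial(2 * n - 2 * k) / (
            factorial(k) * factorial(n - k) * factorial(n - 2 * k))
        total += coeff * z ** (n - 2 * k)
    return total / 2 ** n

for n in range(7):
    for z in (-0.9, -0.3, 0.0, 0.5, 1.0):
        # Both constructions give the same polynomial, with parity (-1)^n.
        assert abs(legendre_recursive(n, z) - legendre_explicit(n, z)) < 1e-12
        assert abs(legendre_explicit(n, -z) - (-1) ** n * legendre_explicit(n, z)) < 1e-12
```

Both functions also return $P_n(1) = 1$ for every $n$, which is a quick sanity check on the normalization.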

Clearly, the spherical harmonics with $m = 0$ are real and positive in the z-direction,
so the choice of phase for $m = \pm l$ corresponds to our above-mentioned
wishes in connection with the factor $i^l$. Generally, with $m \ge 0$, $f_{l0} = Y^{(l)}_0$ and
$f_{l,\pm m} = (\mp)^m \sin^m\theta\;(d^m f_{l0}/d\cos^m\theta)\,\sqrt{(l-m)!/(l+m)!}$, we obtain the expression

$$Y^{(l)}_{\pm m}(\Omega) = (\mp)^m\,\sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\;\sin^m\theta\;
\frac{d^m P_l(\cos\theta)}{d\cos^m\theta}\;\exp(\pm i\,m\varphi).$$


For spherically symmetric problems, we will often expand the wave functions in 
terms of these spherical harmonics, beginning in Sect. 4.5.2. 

Since $P_l(-\cos\theta) = (-)^l\, P_l(\cos\theta)$, the spherical harmonics with orbital angular
momentum $l$ have parity $(-)^l$, using the standard results $\sin(\pi-\theta) = \sin\theta$,
$\cos(\pi-\theta) = -\cos\theta$, and $\exp(\pm i\,m(\varphi+\pi)) = (-)^m \exp(\pm i\,m\varphi)$.
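The closed expression for the spherical harmonics can be evaluated and tested numerically. The sketch below is only an illustration: the function names are hypothetical, the Legendre coefficients are taken from the explicit sum for $P_l$, and the normalization integral is approximated with a simple midpoint rule.

```python
from cmath import exp
from math import cos, factorial, pi, sin, sqrt

def legendre_coeffs(l):
    """Coefficients c[p] of z**p in P_l(z), taken from the explicit sum."""
    c = [0.0] * (l + 1)
    for k in range(l // 2 + 1):
        c[l - 2 * k] = (-1) ** k * factorial(2 * l - 2 * k) / (
            2 ** l * factorial(k) * factorial(l - k) * factorial(l - 2 * k))
    return c

def legendre_derivative(l, m, z):
    """m-th derivative of P_l at z."""
    total = 0.0
    for p, cp in enumerate(legendre_coeffs(l)):
        if p >= m:
            total += cp * factorial(p) / factorial(p - m) * z ** (p - m)
    return total

def spherical_harmonic(l, m, theta, phi):
    """Y^(l)_m with the sign factor (-1)^m for m > 0 and +1 for m <= 0."""
    am = abs(m)
    sign = (-1) ** am if m > 0 else 1
    norm = sqrt((2 * l + 1) / (4 * pi) * factorial(l - am) / factorial(l + am))
    return (sign * norm * sin(theta) ** am
            * legendre_derivative(l, am, cos(theta)) * exp(1j * m * phi))

def norm_integral(l, m, steps=2000):
    """Midpoint-rule estimate of the integral of |Y|^2 over the unit sphere."""
    dtheta = pi / steps
    total = 0.0
    for n in range(steps):
        theta = (n + 0.5) * dtheta
        total += abs(spherical_harmonic(l, m, theta, 0.0)) ** 2 * sin(theta) * dtheta
    return 2 * pi * total

print(norm_integral(2, 1))
```

The same functions can be used to check the symmetry $Y^{(l)*}_m = (-)^m Y^{(l)}_{-m}$ and the parity $(-)^l$ at arbitrary angles.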

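With the spherical harmonics, we know the eigenfunctions of the operator $L^2$ in
the real-space representation (directional representation):

$$\langle\Omega\,|\,L^2\,|\,lm\rangle = \langle\Omega\,|\,lm\rangle\; l(l+1)\,\hbar^2
 = -\Big(\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\Big(\sin\theta\,\frac{\partial}{\partial\theta}\Big)
 + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\Big)\,\langle\Omega\,|\,lm\rangle\;\hbar^2.$$

We will need this operator in Sect. 4.5.2 for central fields, because, according to
p. 142, the centrifugal potential is proportional to $L^2$.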


4.3.10 Coupling of Angular Momenta 


In addition to the orbital angular momentum of electrons and nucleons, we also have
to account for their intrinsic angular momentum (spin). Their total angular momentum
involves both. Hence we now consider

$$\mathbf J = \mathbf L + \mathbf S.$$

Since $\mathbf L$ acts in real space and $\mathbf S$ in spin space, the two operators commute. $\mathbf J$ is
Hermitian like $\mathbf L$ and $\mathbf S$, so

$$[J_x, J_y] = i\hbar\, J_z, \quad\text{and cyclic permutations.}$$

Hence the considerations in Sect. 4.3.8 deliver

$$J^2\,|j,m\rangle = |j,m\rangle\; j(j+1)\,\hbar^2, \quad\text{for } j \in \{0, \tfrac12, 1, \dots\},$$
$$J_z\,|j,m\rangle = |j,m\rangle\; m\hbar, \quad\text{for } m \in \{j, j-1, \dots, -j\},$$
$$J_\pm\,|j,m\rangle = |j,m\pm1\rangle\,\sqrt{j(j+1) - m(m\pm1)}\;\hbar
 = |j,m\pm1\rangle\,\sqrt{(j\mp m)(j\pm m+1)}\;\hbar,$$
$$\mathcal T\,|j,m\rangle = (-)^{j+m}\,|j,-m\rangle.$$


We would now like to apply these general equations to the spin 1/2 case.

Here we could take the uncoupled representation $|l, m_l; \tfrac12, m_s\rangle$, which diagonalizes
$L^2$, $L_z$, $S^2$, and $S_z$. But if there is a spin-orbit coupling, which we derive from
the operator product

$$\mathbf L\cdot\mathbf S = L_z S_z + \tfrac12\,(L_+S_- + L_-S_+),$$

then neither $L_z$ nor $S_z$ will be sharp, only their sum $J_z$. Then the coupled representation
$|(l,\tfrac12)\,j, m\rangle$ is more useful, because it simultaneously diagonalizes $L^2$, $S^2$, $J^2$, and
$J_z$, and hence also $2\,\mathbf L\cdot\mathbf S = J^2 - L^2 - S^2$. With $J_z = L_z + S_z$, we then have
$m = m_l + m_s$, and for a given $l$, $m \le l + \tfrac12 = j$. In fact, $|l, l; \tfrac12, \tfrac12\rangle$ is also an eigenstate of
$J^2 = L^2 + 2\,\mathbf L\cdot\mathbf S + S^2$, since with $2\,\mathbf L\cdot\mathbf S = 2L_zS_z + L_+S_- + L_-S_+$, $L_+|l,l\rangle = |o\rangle$,
and $S_+|\tfrac12,\tfrac12\rangle = |o\rangle$, we find that $\{l(l+1) + 2l\,\tfrac12 + \tfrac34\}\,\hbar^2$ is an eigenvalue, and with
$j = l + \tfrac12$ this can also be written as $j(j+1)\,\hbar^2$. Hence for $j = l + \tfrac12$, we may set the
two states $|l, l; \tfrac12, \tfrac12\rangle$ and $|(l,\tfrac12)\; l+\tfrac12,\; l+\tfrac12\rangle$ equal to each other. Here we finally fix
the phase of the coupled state. The remaining states with $j = l + \tfrac12$ are obtained from
there with the lowering operator $J_- = L_- + S_-$. Since we restrict ourselves here to
$s = \tfrac12$, the operator $S_-^2$ turns out to be zero, and then we have $J_-^n = L_-^n + n\,L_-^{n-1}S_-$. For an
appropriate choice of the phase and with $J_-^n\,|jj\rangle = |j, j-n\rangle\,\sqrt{(2j)!\;n!/(2j-n)!}\;\hbar^n$,
it follows that

$$|(l,\tfrac12)\; l+\tfrac12,\; m\rangle = |l, m+\tfrac12;\; \tfrac12, -\tfrac12\rangle\,\sqrt{\frac{l+\tfrac12-m}{2l+1}}
 \;+\; |l, m-\tfrac12;\; \tfrac12, +\tfrac12\rangle\,\sqrt{\frac{l+\tfrac12+m}{2l+1}}.$$

We then have all $2j + 1 = 2l + 2$ states with $j = l + \tfrac12$ in the coupled basis expanded
in terms of the uncoupled states. But in the uncoupled basis, there are $(2l + 1)\cdot 2$
states with equal $l$, thus $2l$ more states. In fact, we can also couple with $j = l - \tfrac12$.
These states have to be orthogonal to those with equal $l$ and $m$, so the expansion
coefficients are

$$|(l,\tfrac12)\; l-\tfrac12,\; m\rangle = |l, m+\tfrac12;\; \tfrac12, -\tfrac12\rangle\,\sqrt{\frac{l+\tfrac12+m}{2l+1}}
 \;-\; |l, m-\tfrac12;\; \tfrac12, +\tfrac12\rangle\,\sqrt{\frac{l+\tfrac12-m}{2l+1}}.$$


We may also include a phase factor. The phase of the coupled state remains free to
choose; only the relative phases of the states with different $m$ are already fixed by
the choice of matrix elements of $J_\pm$. The last equation obeys a second requirement
due to Condon and Shortley, namely, for $j_1 + j_2 \ge j \ge |j_1 - j_2|$,

$$\langle (j_1, j_2)\; j, j\,|\, j_1, j_1;\; j_2, j - j_1\rangle = \langle j_1, j_1;\; j_2, j - j_1\,|\,(j_1, j_2)\; j, j\rangle > 0,$$

i.e., all coefficients with $m = j$ and $m_1 = j_1$ are to be positive.
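The coupled states for spin 1/2 can be checked by letting $2\,\mathbf L\cdot\mathbf S = 2L_zS_z + L_+S_- + L_-S_+$ act on the two-dimensional space spanned by $|l, m-\tfrac12; \tfrac12, +\tfrac12\rangle$ and $|l, m+\tfrac12; \tfrac12, -\tfrac12\rangle$. The sketch below assumes $\hbar = 1$ and uses illustrative function names; the expected eigenvalues $j(j+1) - l(l+1) - \tfrac34$ are $l$ for $j = l+\tfrac12$ and $-(l+1)$ for $j = l-\tfrac12$.

```python
from math import isclose, sqrt

def coupled_coefficients(l, m, upper):
    """Coefficients of (|l, m-1/2; up>, |l, m+1/2; down>) for j = l+1/2 (upper) or l-1/2."""
    a = sqrt((l + 0.5 + m) / (2 * l + 1))
    b = sqrt((l + 0.5 - m) / (2 * l + 1))
    return (a, b) if upper else (-b, a)

def two_l_dot_s(l, m, vec):
    """Apply 2 L.S (units hbar^2) in the basis (|m-1/2, up>, |m+1/2, down>)."""
    off = sqrt((l + 0.5) ** 2 - m ** 2)   # matrix element of L_+ S_- + L_- S_+
    up, down = vec
    return ((m - 0.5) * up + off * down, off * up - (m + 0.5) * down)

l = 2
for m in (-1.5, -0.5, 0.5, 1.5):
    for upper, eigenvalue in ((True, l), (False, -(l + 1))):
        vec = coupled_coefficients(l, m, upper)
        image = two_l_dot_s(l, m, vec)
        # Each coupled state is normalized and is an eigenvector of 2 L.S.
        assert isclose(vec[0] ** 2 + vec[1] ** 2, 1.0)
        assert all(isclose(image[i], eigenvalue * vec[i], abs_tol=1e-12) for i in range(2))
    upper_vec = coupled_coefficients(l, m, True)
    lower_vec = coupled_coefficients(l, m, False)
    # States with equal l and m but different j are orthogonal.
    assert isclose(upper_vec[0] * lower_vec[0] + upper_vec[1] * lower_vec[1], 0.0, abs_tol=1e-12)
```

For $j = l - \tfrac12$ and $m = j$ the coefficient of the state with $m_l = l$ (spin down) is positive, in line with the Condon and Shortley requirement.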


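Hence all expansion coefficients of the angular momentum coupling, i.e., all
Clebsch-Gordan coefficients, are now real. Here we adopt the abbreviation

$$\left(\begin{array}{cc|c} j_1 & j_2 & j \\ m_1 & m_2 & m \end{array}\right)
 = \langle j_1, m_1;\; j_2, m_2\,|\,(j_1, j_2)\; j, m\rangle = \langle (j_1, j_2)\; j, m\,|\, j_1, m_1;\; j_2, m_2\rangle,$$

but other notations do occur. We have now derived, e.g.,

$$\left(\begin{array}{cc|c} l & \tfrac12 & l\pm\tfrac12 \\ m-\tfrac12 & +\tfrac12 & m \end{array}\right) = \pm\sqrt{\frac{l+\tfrac12\pm m}{2l+1}}, \qquad
\left(\begin{array}{cc|c} l & \tfrac12 & l\pm\tfrac12 \\ m+\tfrac12 & -\tfrac12 & m \end{array}\right) = \sqrt{\frac{l+\tfrac12\mp m}{2l+1}}.$$

Likewise we can now couple two spin-1/2 states to triplet and singlet states. If
instead of $|\tfrac12, \tfrac12\rangle$ we write for short $|\!\uparrow\rangle$ (spin up), and instead of $|\tfrac12, -\tfrac12\rangle$ the
abbreviation $|\!\downarrow\rangle$ (spin down), it follows that

$$|(\tfrac12,\tfrac12)\,1, +1\rangle = |\!\uparrow\uparrow\rangle, \qquad
|(\tfrac12,\tfrac12)\,1, 0\rangle = \frac{|\!\uparrow\downarrow\rangle + |\!\downarrow\uparrow\rangle}{\sqrt2}, \qquad
|(\tfrac12,\tfrac12)\,1, -1\rangle = |\!\downarrow\downarrow\rangle,$$
$$|(\tfrac12,\tfrac12)\,0, 0\rangle = \frac{|\!\uparrow\downarrow\rangle - |\!\downarrow\uparrow\rangle}{\sqrt2}.$$

The triplet states are thus symmetric under exchange of the two uncoupled states,
while the singlet state is antisymmetric.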


4.3.11 Summary: Correspondence Principle 


In the last three sections, we have worked out the basic features of quantum theory.
The observables of classical mechanics become Hermitian operators, and relations
between measurable quantities become operator equations. Important here is the
commutation behavior. The commutator corresponds to the classical Poisson bracket,
except for the factor $i\hbar$. The factor $i$ has to occur for a quantity to be Hermitian, while
here $\hbar$ introduces Planck's action quantum as a scale factor.

The comparison of the position and momentum representations $\{|\mathbf r\rangle\}$ and $\{|\mathbf p\rangle\}$
is instructive. These diagonalize the position and momentum operators, respectively.
In particular, from the basic commutation relation $[X^k, P_{k'}] = i\hbar\,\delta^k_{k'}$, we
have derived the representation of each operator in the other basis, and also
$\langle\mathbf r\,|\,\mathbf p\rangle = \langle\mathbf p\,|\,\mathbf r\rangle^* = (2\pi\hbar)^{-3/2}\exp(i\,\mathbf p\cdot\mathbf r/\hbar)$. This probability amplitude is usually called the
wave function of the state with momentum $\mathbf p$. For the derivation we used the equation
$x\,\delta'(x) = -\delta(x)$ and thus found

$$P_k \;\hat{=}\; \frac{\hbar}{i}\,\frac{\partial}{\partial x^k} \quad\text{and}\quad X^k \;\hat{=}\; i\hbar\,\frac{\partial}{\partial p_k}$$

for $P_k$ in the position representation and $X^k$ in the momentum representation. If we do
not use Cartesian coordinates, then covariant and contravariant components are different.
Note that the metric fundamental tensor generally depends upon the position. For
the kinetic energy, which is a scalar, we need, e.g., the quantity $\sum_k P_k P^k \;\hat{=}\; -\hbar^2\Delta$.
We have already derived the Laplace operator for general coordinates on p. 38:

$$\Delta\psi = \frac{1}{\sqrt g}\,\sum_{ik}\frac{\partial}{\partial x^i}\Big(\sqrt g\; g^{ik}\,\frac{\partial\psi}{\partial x^k}\Big), \quad\text{with } g = \det(g_{ik}).$$

Fig. 4.10 Eigenvalues of the angular momentum operator for $m \in \{-j, \dots, j\}$ and $j \in \{0, \tfrac12, 1, \dots\}$.
Half-integer eigenvalues (open circles) occur only for spin momenta, because the real-space
representations are then ambiguous
We have also investigated the way the non-commutability of operators affects physical
laws for the case of the angular momentum. For $l \neq 0$, only one directional
component can be sharp, along with the square of the angular momentum, which has
eigenvalues $l(l+1)\,\hbar^2$ with $l \in \{0, 1, 2, \dots\}$. The directional quantum number $m$ for
a given $l$ can only be an integer between $-l$ and $+l$. Using $\mathbf L = \mathbf R\times\mathbf P$, we derived
these properties from those of $\mathbf R$ and $\mathbf P$ (see Fig. 4.10).


4.4 Time Dependence 


4.4.1 Heisenberg Equation and the Ehrenfest Theorem 


We now consider time dependence. We shall be guided once again by classical 
physics. 

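If $a$ is a function of the canonical position and momentum coordinates, and also
of the time, we have

$$\frac{da}{dt} = \sum_k\Big(\frac{\partial a}{\partial x^k}\frac{dx^k}{dt} + \frac{\partial a}{\partial p_k}\frac{dp_k}{dt}\Big) + \frac{\partial a}{\partial t}.$$

As already shown on p. 124, using the Hamilton equations

$$\frac{dx^k}{dt} = \frac{\partial H}{\partial p_k}, \qquad \frac{dp_k}{dt} = -\frac{\partial H}{\partial x^k},$$

we find classically

$$\frac{da}{dt} = \sum_k\Big(\frac{\partial a}{\partial x^k}\frac{\partial H}{\partial p_k} - \frac{\partial a}{\partial p_k}\frac{\partial H}{\partial x^k}\Big) + \frac{\partial a}{\partial t}
 = [a, H] + \frac{\partial a}{\partial t}.$$

The derivative $da/dt$ is thus equal to the Poisson bracket $[a, H]$, if we disregard any
explicit time dependence.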

Now, in quantum theory, on p. 316 we already assigned the commutator of the
corresponding operators (divided by $i\hbar$) to the classical Poisson bracket. This idea
for translating between the classical and quantum cases leads us to

$$\frac{dA}{dt} = \frac{[A, H]}{i\hbar} + \frac{\partial A}{\partial t},$$

known as the Heisenberg equation. Here we have to take any time-independent rep- 
resentation and then differentiate each matrix element of A with respect to time in 
order to form dA/dt (in this representation). We shall usually restrict ourselves to 
operators A, which do not depend on time explicitly. Then all operators commuting 
with H (their eigenvalues are called good quantum numbers) are constants of the 
motion, in particular the Hamilton operator H itself. Hence the energy representa- 
tion, which diagonalizes H, is particularly important, and we shall consider many 
examples in the next section. Note that friction effects are beyond the scope of this 
section and will be treated only in Sect. 4.4.3. 

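With the Heisenberg equation, we can now determine the derivatives of expectation
values with respect to the time, taking time-independent states as the basis:

$$\frac{d\langle A\rangle}{dt} = \frac{i}{\hbar}\,\langle[H, A]\rangle + \Big\langle\frac{\partial A}{\partial t}\Big\rangle.$$

If we use here $H = P^2/2m + V(\mathbf R)$ and determine the derivatives of $\langle\mathbf R\rangle$ and $\langle\mathbf P\rangle$
with respect to time, then $\langle[P^2, \mathbf R]\rangle$ is important in the first case and $\langle[V(\mathbf R), \mathbf P]\rangle$
in the second. Now $[P^2, X] = [P_x^2, X] = -2i\hbar\,P_x$, and in addition (according to
p. 316), $[f(X), P] = i\hbar\, f'(X)$ holds. Consequently, the following equations are valid:

$$\frac{d\langle\mathbf R\rangle}{dt} = \frac{\langle\mathbf P\rangle}{m} \quad\text{and}\quad \frac{d\langle\mathbf P\rangle}{dt} = \langle -\nabla V\rangle = \langle\mathbf F\rangle.$$

Thus the expectation values satisfy the equations of classical physics, which is known
as Ehrenfest's theorem, although $\langle\mathbf F(\mathbf R)\rangle$ does not need to be equal to $\mathbf F(\langle\mathbf R\rangle)$.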


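In order to see how the uncertainties in $\mathbf R$ and $\mathbf P$ change with time, we determine

$$\frac{d\,(\langle\mathbf R\cdot\mathbf R\rangle - \langle\mathbf R\rangle\cdot\langle\mathbf R\rangle)}{dt} = \frac{\langle\mathbf P\cdot\mathbf R + \mathbf R\cdot\mathbf P\rangle - 2\,\langle\mathbf R\rangle\cdot\langle\mathbf P\rangle}{m},$$
$$\frac{d\,(\langle\mathbf P\cdot\mathbf P\rangle - \langle\mathbf P\rangle\cdot\langle\mathbf P\rangle)}{dt} = \langle\mathbf P\cdot\mathbf F + \mathbf F\cdot\mathbf P\rangle - 2\,\langle\mathbf P\rangle\cdot\langle\mathbf F\rangle.$$

For a constant force (e.g., in the free case), we have $\langle\mathbf P\cdot\mathbf F\rangle = \langle\mathbf P\rangle\cdot\mathbf F = \langle\mathbf F\cdot\mathbf P\rangle$.
The momentum uncertainty then remains constant, and for sharp momentum,
so does the position uncertainty.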


4.4.2 Time Dependence: Heisenberg and Schrödinger 
Pictures 


In the last section, we started from the so-called Heisenberg picture. In the Heisenberg
picture the observables depend on the time, but the states do not:

$$\frac{d}{dt}A_H = \frac{i}{\hbar}\,[H_H, A_H] + \frac{\partial}{\partial t}A_H, \qquad \frac{d}{dt}\,|\psi_H\rangle = |o\rangle.$$

To solve the Heisenberg equation, we search for a time-dependent unitary transformation
$U$ which connects the operator $A_H$ ($A$ in the Heisenberg picture) with an
operator $A_S$ ($A$ in the Schrödinger picture), which does not depend upon time:

$$A_S = U A_H U^\dagger, \quad\text{with}\quad \frac{dA_S}{dt} = 0.$$

Hence the Heisenberg equation delivers

$$0 = \frac{dU}{dt}\,A_H\,U^\dagger + U\Big(\frac{i}{\hbar}\,[H_H, A_H] + \frac{\partial A_H}{\partial t}\Big)U^\dagger + U A_H\,\frac{dU^\dagger}{dt}.$$

If we restrict ourselves to observables which depend on time only implicitly, whence
$\partial A_H/\partial t = 0$, then this condition can be satisfied for all operators $A_H$ if the unitary
operator $U$ satisfies

$$\frac{dU}{dt} = -\frac{i}{\hbar}\,U H_H \quad\Longleftrightarrow\quad \frac{dU^\dagger}{dt} = +\frac{i}{\hbar}\,H_H U^\dagger.$$

Here the zero times coincide in the two pictures: $U(0) = 1$ or $A_H(0) = A_S$. Both
requirements are satisfied by the time-shift operator

$$U(t) = \exp\frac{-i\,H_H\,t}{\hbar},$$

if $H_H$ does not depend on time (otherwise we still have to integrate, as we shall see
in Sect. 4.4.4). Note that, since p. 317, we already know of similar position- and
momentum-shift operators. The Hamilton operator in this situation also commutes
with $U$, and hence $H_H = H_S = H$. We shall now restrict ourselves to this case.

In addition, from $|\psi_S\rangle = U\,|\psi_H\rangle$, we can say that in the Schrödinger picture the
observables do not depend on time, but the states do (Schrödinger equation):

$$\frac{d}{dt}A_S = 0, \qquad \frac{d}{dt}\,|\psi_S\rangle = -\frac{i}{\hbar}\,H\,|\psi_S\rangle.$$

In general, differential equations for Hilbert space vectors are easier to integrate than
those for operators (the Heisenberg equation). Hence, we shall work mainly in the
Schrödinger picture and leave out the subscript S. In particular, we then have, in the
real-space representation,

$$-\frac{\hbar}{i}\,\frac{\partial\psi(t,\mathbf r)}{\partial t} = H\,\psi(t,\mathbf r),$$

where $H(\mathbf R, \mathbf P) = H(\mathbf r, -i\hbar\nabla)$ is to be taken. This equation is similar to the
Hamilton-Jacobi differential equation of p. 135, viz.,

$$\frac{\partial W}{\partial t} + H(\mathbf r, \nabla W) = 0,$$

if Hamilton's action function $W = \int L\,dt$ is replaced by $-i\hbar\,\psi$, with Planck's action
quantum $h = 2\pi\hbar$. However, instead of $\nabla W\cdot\nabla W$, we have not $-\hbar^2\,\nabla\psi\cdot\nabla\psi$, but
rather $-\hbar^2\,\nabla\cdot\nabla\psi = -\hbar^2\,\Delta\psi$.

If we restrict ourselves to particles of mass $m$ and charge $q$ in an electric potential
$\Phi$, then the time-dependent Schrödinger equation (in the real-space representation)
reads

$$i\hbar\,\frac{\partial}{\partial t}\,\psi(t,\mathbf r) = \Big(-\frac{\hbar^2}{2m}\,\Delta + V(\mathbf r)\Big)\,\psi(t,\mathbf r),$$

with $V(\mathbf r) = q\,\Phi(\mathbf r)$. If we consider the wave function associated with an eigenstate
of $H$ with the sharp energy $E_n$, where the zero energy may be chosen arbitrarily (a
different zero leads only to a new time-dependent phase factor in the wave function,
which will not affect the experimental value), we have

$$\psi_n(t,\mathbf r) = \exp\frac{-i\,E_n t}{\hbar}\;\psi_n(\mathbf r), \quad\text{with}\quad \psi_n(\mathbf r) = \psi_n(0,\mathbf r),$$

and it only remains to solve the time-independent Schrödinger equation (in the
real-space representation)

$$E_n\,\psi_n(\mathbf r) = \Big(-\frac{\hbar^2}{2m}\,\Delta + V(\mathbf r)\Big)\,\psi_n(\mathbf r).$$

For a magnetic field, instead of $V$ and in addition to $q\,\Phi$, further terms are still to be
considered, as was shown in Sect. 4.3.7. Since for all states with sharp energy, the
time appears only in the phase factor, which does not affect the expectation values,
they are called stationary states.

In the Schrödinger picture, if we transform with any time-dependent unitary operator
$U$, we obtain

$$|\psi'\rangle = U\,|\psi\rangle, \quad\text{with}\quad i\hbar\,\frac{d}{dt}\,|\psi\rangle = H\,|\psi\rangle \quad\text{and}\quad i\hbar\,\frac{d}{dt}\,|\psi'\rangle = H'\,|\psi'\rangle,$$

and clearly also $i\hbar\,(\dot U\,|\psi\rangle + U\,|\dot\psi\rangle) = H'\,U\,|\psi\rangle$, or

$$H' = i\hbar\,\frac{dU}{dt}\,U^\dagger + U H U^\dagger.$$

An example of an application is the unitary transformation to $H' = 0$, which clearly
results in $i\hbar\,\dot U = -U H$, or $U = \exp(i H t/\hbar)$ (if $H$ does not depend on time). This
corresponds to the transition from the Schrödinger to the Heisenberg picture, the
states of which do not depend on time.


4.4.3 Time Dependence of the Density Operator 


The density operator turns out to be useful also for the time dependence. In particular, 
we may also use time-independent expansion bases in the Schrödinger picture if the 
density operator takes care of the time dependence. In the Heisenberg picture, it does 
not depend on time. 

According to p. 312, unitary transformations do not change expectation values.
Hence for the time dependence, the notation

$$\langle A\rangle = \mathrm{tr}\{U(t)\,\rho_H\,U^\dagger(t)\,A_S\}$$

is to be preferred, since $\rho_H$ and $A_S$ do not depend on time, and in addition to
$U^\dagger(t)\,A_S\,U(t) = A_H(t)$, we have

$$\rho_S(t) = U(t)\,\rho_H\,U^\dagger(t).$$

We can read off from this that the density operator $\rho_S(t)$ and the observables $A_H(t)$
depend oppositely (contravariantly) on time. With $\rho = U\rho_H U^\dagger$ (leaving out the
subscript S), we have the von Neumann equation

$$\frac{d\rho}{dt} = \frac{[H, \rho]}{i\hbar}.$$

The equation $d\rho/dt = 0$ in the Heisenberg picture corresponds classically to the
Liouville equation (see p. 129) $d\rho/dt = 0$, which is then reformulated as
$\partial\rho/\partial t + [\rho, H] = 0$, because the classical probability density $\rho$ (in phase space) depends
upon further variables in addition to $t$. The density operator depends only on time,
the other variables being selected only with their representation. Hence, it does not
make sense to write the von Neumann equation (as an operator equation) with the
partial derivative $\partial\rho/\partial t = [H, \rho]/i\hbar$.

In the energy representation, that is, with $H\,|n\rangle = |n\rangle\,E_n$, $\langle n|n'\rangle = \delta_{nn'}$, and
$\sum_n |n\rangle\langle n| = 1$, the von Neumann equation implies

$$\langle n|\,\rho(t)\,|n'\rangle = \langle n|\,\rho(0)\,|n'\rangle\;\exp\frac{-i\,(E_n - E_{n'})\,t}{\hbar}.$$

Only the energy differences are important here; the zero of the energy does not
affect the density matrix.

According to the von Neumann equation, none of the expectation values of powers
of $\rho$ depend on time, since $d\langle\rho^n\rangle/dt \propto \mathrm{tr}(\rho^n\,[H, \rho])$ always vanishes. This does not
lead to arbitrarily many invariants, but to exactly $N$ constants of the motion in an
$N$-dimensional Hilbert space (the normalization condition $\langle\rho^0\rangle = \langle 1\rangle = 1$ counts
here). In particular, the purity of a state ($\mathrm{tr}\,\rho^2$) remains, something that is changed
only by dissipation (see Sect. 4.6), and this cannot be described with Hamiltonian
mechanics.
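These statements are easy to illustrate for a small density matrix in the energy representation. The following is a minimal sketch with $\hbar = 1$ by default; the function names and the sample numbers are illustrative assumptions, not taken from the text.

```python
from cmath import exp

def evolve_density(rho0, energies, t, hbar=1.0):
    """Matrix elements <n|rho(t)|n'> = <n|rho(0)|n'> exp(-i (E_n - E_n') t / hbar)."""
    n = len(energies)
    return [[rho0[a][b] * exp(-1j * (energies[a] - energies[b]) * t / hbar)
             for b in range(n)] for a in range(n)]

def trace_of_power(rho, power):
    """tr(rho**power) by repeated matrix multiplication."""
    n = len(rho)
    result = [[complex(a == b) for b in range(n)] for a in range(n)]
    for _ in range(power):
        result = [[sum(result[a][c] * rho[c][b] for c in range(n))
                   for b in range(n)] for a in range(n)]
    return sum(result[a][a] for a in range(n))

# Hypothetical two-level example: a Hermitian matrix with unit trace.
rho0 = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]
energies = [0.0, 1.5]

rho_t = evolve_density(rho0, energies, t=2.0)
shifted = evolve_density(rho0, [e + 10.0 for e in energies], t=2.0)

print(trace_of_power(rho0, 2), trace_of_power(rho_t, 2))
```

Shifting all energies by the same constant leaves `rho_t` unchanged, and the traces of all powers agree with their initial values.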

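The von Neumann equation becomes rather simple for doublets. For these, according
to pp. 309 and 312, we have

$$H = \frac{\mathrm{tr}H + \boldsymbol\sigma\cdot\mathrm{tr}(\boldsymbol\sigma H)}{2} \quad\text{and}\quad \rho = \frac{1 + \boldsymbol\sigma\cdot\langle\boldsymbol\sigma\rangle}{2}.$$

We thus search for $d\langle\boldsymbol\sigma\rangle/dt = \mathrm{tr}(\boldsymbol\sigma\; d\rho/dt)$. Now $\mathrm{tr}(\boldsymbol\sigma\,[H, \rho]) = \langle[\boldsymbol\sigma, H]\rangle$. The
commutator of $\boldsymbol\sigma$ with $\tfrac12\,\boldsymbol\sigma\cdot\mathrm{tr}(\boldsymbol\sigma H)$ is thus important, and according to p. 325, this can
be derived from the expression $i\;\mathrm{tr}(\boldsymbol\sigma H)\times\boldsymbol\sigma$. Hence we obtain in total

$$\frac{d\langle\boldsymbol\sigma\rangle}{dt} = \boldsymbol\Omega\times\langle\boldsymbol\sigma\rangle, \quad\text{with}\quad \boldsymbol\Omega \equiv \frac{\mathrm{tr}(\boldsymbol\sigma H)}{\hbar},$$

as for a rotational motion with angular velocity $\boldsymbol\Omega$ (see p. 92). A well known example is
the Larmor precession of a magnetic moment in a magnetic field, where
$H = -\mu_B\,\boldsymbol\sigma\cdot\mathbf B$ appears as the Hamilton operator in the Pauli equation (p. 327),
whence $\mathrm{tr}(\boldsymbol\sigma H) = -2\,\mu_B\,\mathbf B$.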
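The precession can be made concrete with a small numerical sketch. It assumes the hypothetical Hamilton operator $H = \tfrac12\hbar\omega\,\sigma_z$ (so that $\boldsymbol\Omega = \mathrm{tr}(\boldsymbol\sigma H)/\hbar = \omega\,\mathbf e_z$) and an initial spinor polarized along x; the function names are illustrative. The Bloch vector computed from the evolved spinor should rotate about the z-axis with angular velocity $\omega$.

```python
from cmath import exp
from math import cos, isclose, sin, sqrt

def evolve(psi0, omega, t):
    """Spinor at time t for H = (hbar*omega/2) sigma_z; H is diagonal, so each
    component only acquires a phase factor."""
    up, down = psi0
    return (exp(-0.5j * omega * t) * up, exp(+0.5j * omega * t) * down)

def bloch_vector(psi):
    """Expectation values (<sigma_x>, <sigma_y>, <sigma_z>) for a normalized spinor."""
    up, down = complex(psi[0]), complex(psi[1])
    z = up.conjugate() * down
    return (2 * z.real, 2 * z.imag, abs(up) ** 2 - abs(down) ** 2)

omega = 3.0
psi0 = (1 / sqrt(2), 1 / sqrt(2))       # polarized along +x

for t in (0.0, 0.2, 0.7, 1.3):
    sx, sy, sz = bloch_vector(evolve(psi0, omega, t))
    # Solution of d<sigma>/dt = Omega x <sigma> with Omega = omega * e_z.
    assert isclose(sx, cos(omega * t), abs_tol=1e-12)
    assert isclose(sy, sin(omega * t), abs_tol=1e-12)
    assert isclose(sz, 0.0, abs_tol=1e-12)
    print(t, sx, sy, sz)
```

The length of the Bloch vector stays equal to one throughout, as expected for a pure state.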

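For the Larmor precession, $\langle\boldsymbol\sigma\rangle$ denotes the spin polarization. But in general we
may also understand $|\!\uparrow\rangle$ and $|\!\downarrow\rangle$ as states other than those with $m_s = \pm\tfrac12$. We then
speak generally of a Bloch vector $\langle\boldsymbol\sigma\rangle$. According to p. 308, with $\Psi\Psi^\dagger + \Psi^\dagger\Psi = 1$,
we then have

$$H = \Psi^\dagger\Psi\; H_{\uparrow\uparrow} + \Psi^\dagger\, H_{\uparrow\downarrow} + \Psi\, H_{\downarrow\uparrow} + \Psi\Psi^\dagger\; H_{\downarrow\downarrow}
 = 1\;\frac{H_{\uparrow\uparrow} + H_{\downarrow\downarrow}}{2} + \sigma_x\,\frac{H_{\uparrow\downarrow} + H_{\downarrow\uparrow}}{2}
 + \sigma_y\,\frac{H_{\downarrow\uparrow} - H_{\uparrow\downarrow}}{2i} + \sigma_z\,\frac{H_{\uparrow\uparrow} - H_{\downarrow\downarrow}}{2}.$$

With $\mathrm{tr}\,\boldsymbol\sigma = 0$, $\mathrm{tr}\,\boldsymbol\sigma\,\sigma_i = 2\,\mathbf e_i$, and $H_{\downarrow\uparrow} = H_{\uparrow\downarrow}^{\,*}$, we obtain

$$\mathrm{tr}(\boldsymbol\sigma H) = 2\,\Big\{\mathbf e_x\,\mathrm{Re}\,H_{\downarrow\uparrow} + \mathbf e_y\,\mathrm{Im}\,H_{\downarrow\uparrow} + \mathbf e_z\,\frac{H_{\uparrow\uparrow} - H_{\downarrow\downarrow}}{2}\Big\} = \hbar\,\boldsymbol\Omega,$$

and this vector determines the precession of the Bloch vectors $\langle\boldsymbol\sigma\rangle$ in a space whose
z-component contains information about the occupation of the states $|\!\uparrow\rangle$ and $|\!\downarrow\rangle$.
Here $\hbar\Omega$ tells us how much the two energy eigenvalues differ from each other,
which follows from $\det(H - E) = 0$. According to p. 309, we have in particular,
$E_\pm = \tfrac12\,(\mathrm{tr}H \pm \hbar\Omega)$, where $\hbar\Omega$ is the square root of
$(\mathrm{tr}H)^2 - 4\det H = (H_{\uparrow\uparrow} - H_{\downarrow\downarrow})^2 + 4\,|H_{\uparrow\downarrow}|^2$.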
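As a quick numerical illustration, the vector $\hbar\boldsymbol\Omega = \mathrm{tr}(\boldsymbol\sigma H)$ and the two energies can be computed from the matrix elements of a Hermitian $2\times2$ matrix. This is only a sketch; the function names and the sample matrices are assumptions made for the example.

```python
from math import sqrt

def bloch_rotation_vector(h_uu, h_ud, h_dd):
    """hbar*Omega = tr(sigma H) for H = [[h_uu, h_ud], [conj(h_ud), h_dd]]."""
    h_du = complex(h_ud).conjugate()
    return (2 * h_du.real, 2 * h_du.imag, h_uu - h_dd)

def energies(h_uu, h_ud, h_dd):
    """E_-, E_+ = (tr H -/+ hbar*Omega) / 2 with hbar*Omega = |tr(sigma H)|."""
    hbar_omega = sqrt(sum(c * c for c in bloch_rotation_vector(h_uu, h_ud, h_dd)))
    trace = h_uu + h_dd
    return (0.5 * (trace - hbar_omega), 0.5 * (trace + hbar_omega))

# Hypothetical example: H = [[1, 2], [2, -1]] has the eigenvalues -sqrt(5), +sqrt(5).
e_minus, e_plus = energies(1.0, 2.0, -1.0)
print(e_minus, e_plus)

# The product of the eigenvalues reproduces det H, their sum reproduces tr H.
assert abs(e_minus * e_plus - (1.0 * (-1.0) - 2.0 ** 2)) < 1e-12
assert abs(e_minus + e_plus - 0.0) < 1e-12
```

A complex off-diagonal element only changes the direction of $\boldsymbol\Omega$ in the xy-plane, not the energy splitting, which depends on $|H_{\uparrow\downarrow}|$.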

The considerations can be transferred from 2 to $N$ dimensions of the Hilbert space,
if, according to Sect. 4.2.5, we start from a basis $\{C_n\}$ of time-independent Hermitian
operators. In particular, according to the von Neumann equation (see p. 313),

$$\mathrm{tr}\,\rho^2 = \frac{1}{c}\,\sum_n \langle C_n\rangle^2$$

is conserved, and for $C_0 = \sqrt{c/N}\;1$, so is $\langle C_0\rangle$. The Bloch vector with real components
$\langle C_1\rangle, \dots$ has the same length at all times. Here, according to the von Neumann
equation, we have

$$\frac{d\langle C_n\rangle}{dt} = \sum_{n'} \Omega_{nn'}\,\langle C_{n'}\rangle, \quad\text{with}\quad \hbar\,\Omega_{nn'} = \mathrm{tr}\Big(\frac{i\,[H, C_n]\;C_{n'}}{c}\Big),$$

and for $C_0 \propto 1$, we may restrict ourselves to $n \neq 0 \neq n'$. If $H$ does not depend on
time, then neither do any of the coefficients $\Omega_{nn'}$ of the system of linear differential
equations. Since they are all real and form a skew-symmetric matrix,

$$\Omega_{nn'} = \Omega_{nn'}^{\,*} = -\Omega_{n'n},$$

their eigenvalues are purely imaginary and pairwise complex-conjugate to each other.


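The von Neumann equation also yields the time dependence of the Wigner function
from Sect. 4.3.5:

$$\rho(t, \mathbf r, \mathbf p) = \frac{1}{(\pi\hbar)^3}\int d^3r'\;\langle\mathbf r - \mathbf r'|\,\rho(t)\,|\mathbf r + \mathbf r'\rangle\;\exp\frac{+2i\,\mathbf p\cdot\mathbf r'}{\hbar}
 = \frac{1}{(\pi\hbar)^3}\int d^3p'\;\langle\mathbf p - \mathbf p'|\,\rho(t)\,|\mathbf p + \mathbf p'\rangle\;\exp\frac{-2i\,\mathbf p'\cdot\mathbf r}{\hbar}.$$

With $\langle\mathbf p - \mathbf p'|\,[P^2, \rho]\,|\mathbf p + \mathbf p'\rangle = -4\,\mathbf p\cdot\mathbf p'\;\langle\mathbf p - \mathbf p'|\,\rho\,|\mathbf p + \mathbf p'\rangle$, we have in particular
$[P^2, \rho]/i\hbar \;\hat{=}\; -2\,\mathbf p\cdot\nabla_r\,\rho$, while on the other hand, if $V$ depends upon the
position only locally, i.e., if we have $\langle\mathbf r|V|\mathbf r'\rangle = V(\mathbf r)\,\delta(\mathbf r - \mathbf r')$, then

$$\Big(\frac{\partial}{\partial t} + \frac{\mathbf p}{m}\cdot\nabla_r\Big)\,\rho(t, \mathbf r, \mathbf p)
 = \frac{1}{i\hbar\,(\pi\hbar)^3}\int d^3r'\;\{V(\mathbf r - \mathbf r') - V(\mathbf r + \mathbf r')\}\;\langle\mathbf r - \mathbf r'|\,\rho\,|\mathbf r + \mathbf r'\rangle\;\exp\frac{2i\,\mathbf p\cdot\mathbf r'}{\hbar}.$$

For a harmonic oscillation, the right-hand side can be traced back to the expression
$\nabla V\cdot\nabla_p\,\rho(t, \mathbf r, \mathbf p)$, i.e., to the gradient of $\rho$ in momentum space. With $\mathbf p/m = \mathbf v$, we
thus have in the harmonic approximation (and naturally also for the free motion with
$\mathbf F = 0$),

$$\Big(\frac{\partial}{\partial t} + \mathbf v\cdot\nabla_r + \mathbf F\cdot\nabla_p\Big)\,\rho(t, \mathbf r, \mathbf p) = 0.$$

This is the collision-free Boltzmann equation, which holds quite generally in classical
mechanics (and also for other potentials, see Sect. 6.2.3), where $\rho(t, \mathbf r, \mathbf p)$ is then
the probability density in phase space.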


4.4.4 Time-Dependent Interaction and Dirac Picture 


In addition to the Heisenberg and Schrödinger pictures, there is also the Dirac picture,
often called the interaction representation, used in particular in time-dependent
perturbation theory and scattering theory. There the Hamilton operator is split into a
free part $H_0$ and an interaction $V$, viz.,

$$H = H_0 + V,$$

where $H_0$ does not depend on time; otherwise the following equation would have
to be generalized, as will be shown later. If we set

$$U_0(t) = \exp\frac{-i\,H_0\,t}{\hbar},$$

then for $H \approx H_0$, $U \approx U_0$ is also valid, at least for time spans that are not too long.
By the interaction representation, we now mean

$$|\psi_D(t)\rangle = U_0^\dagger(t)\,|\psi_S(t)\rangle = U_0^\dagger(t)\,U(t)\,|\psi_H\rangle,$$

and correspondingly

$$A_D = U_0^\dagger\,A_S\,U_0 = U_0^\dagger\,U\,A_H\,U^\dagger\,U_0.$$

Hence it follows that $d|\psi_D\rangle/dt = i\hbar^{-1}\,(H_0 - U_0^\dagger H U_0)\,|\psi_D\rangle$, so with $H_{0D} - H_D = -V_D$,
we find

$$\frac{d}{dt}\,|\psi_D\rangle = -\frac{i}{\hbar}\,V_D\,|\psi_D\rangle \quad\text{and}\quad \frac{d}{dt}\,A_D = \frac{i}{\hbar}\,[H_0, A_D].$$

In the Dirac picture the time dependence of the observables becomes fixed by $H_0$ and
that of the states by $V_D$.


If we set $|\psi_D(t)\rangle = U_D(t)\,|\psi_D(0)\rangle$ with $|\psi_D(0)\rangle = |\psi_S(0)\rangle = |\psi_H\rangle$, then we
obtain

$$U_D(t) = U_0^\dagger(t)\,U(t) \quad\Longrightarrow\quad U(t) = U_0(t)\,U_D(t).$$

Clearly, with $i\hbar^{-1}\,U_0^\dagger\,(H_0 - H)\,U = -i\hbar^{-1}\,V_D\,U_D$, we have the differential equation

$$\frac{dU_D}{dt} = -\frac{i}{\hbar}\,V_D(t)\,U_D(t).$$

To integrate this, we have to respect the order of the operators: an operator at a later
time should only act later and thus should stand to the left of operators at earlier
times. This requirement is indicated by the special time-ordering operator $T$:

$$U_D(t) = T\exp\Big(-\frac{i}{\hbar}\int_0^t dt'\;V_D(t')\Big).$$

The derivative of $T\exp\int_0^t A(t')\,dt'$ with respect to $t$ is equal to $A(t)$ times the expression
to be differentiated. In addition, we have $T\exp(0) = 1$. We thus obtain the
integral equation

$$T\exp\int_0^t dt'\,A(t') = 1 + \int_0^t dt'\,A(t')\;T\exp\int_0^{t'} dt''\,A(t''),$$

which can be solved step by step:

$$T\exp\int_0^t dt'\,A(t') = 1 + \int_0^t dt'\,A(t') + \int_0^t dt'\int_0^{t'} dt''\;A(t')\,A(t'') + \cdots.$$

In the term of $n$th order, there are $n$ time-ordered operators $A$. This expansion is used
in time-dependent perturbation theory. Terms higher than the first contribution are
usually neglected.
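The step-by-step solution can be compared numerically with a time-ordered product of short-time factors. The sketch below is only an illustration under stated assumptions: it uses a hypothetical generator $A(t) = -i\,(\sigma_x + t\,\sigma_z)$, whose values at different times do not commute, plain nested lists for the $2\times2$ matrices, and a midpoint rule for the integrals.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b, scale=1.0):
    """Return a + scale * b for 2x2 matrices."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
ZERO = [[0j, 0j], [0j, 0j]]

def generator(t):
    """Hypothetical A(t) = -i (sigma_x + t sigma_z)."""
    return [[-1j * t, -1j], [-1j, 1j * t]]

def ordered_exponential(t, steps=1000):
    """Time-ordered product of short-time factors; later times stand to the left."""
    dt = t / steps
    u = IDENTITY
    for n in range(steps):
        a = generator((n + 0.5) * dt)
        step = add(add(IDENTITY, a, dt), matmul(a, a), 0.5 * dt * dt)
        u = matmul(step, u)
    return u

def dyson_terms(t, steps=1000):
    """First- and second-order terms of the series (midpoint rule)."""
    dt = t / steps
    first, second = ZERO, ZERO
    for n in range(steps):
        a = generator((n + 0.5) * dt)
        inner = add(first, a, 0.5 * dt)        # integral of A from 0 to the midpoint
        second = add(second, matmul(a, inner), dt)
        first = add(first, a, dt)
    return first, second

def max_abs_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

t = 0.1
exact = ordered_exponential(t)
first, second = dyson_terms(t)
error_first_order = max_abs_difference(exact, add(IDENTITY, first))
error_second_order = max_abs_difference(exact, add(add(IDENTITY, first), second))
print(error_first_order, error_second_order)
```

For a short time span the second-order term reduces the error noticeably, which is the sense in which the expansion is useful when the interaction acts only weakly or briefly.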

For density operators, we have the equation $\langle A\rangle = \mathrm{tr}(\rho A)$ in each picture. With
$A_S = U_0\,A_D\,U_0^\dagger$ and $A_H = U^\dagger A_S U = U_D^\dagger A_D U_D$, we thus find

$$\rho_D = U_0^\dagger\,\rho_S\,U_0 = U_D\,\rho_H\,U_D^\dagger,$$

which leads to the differential equation

$$\frac{d\rho_D}{dt} = -\frac{i}{\hbar}\,[V_D, \rho_D].$$

With the series expansion for $U_D(t)$, we thus obtain

$$\rho_D(t) = \rho_D(0) - \frac{i}{\hbar}\int_0^t dt'\;[V_D(t'), \rho_D(0)]
 - \frac{1}{\hbar^2}\int_0^t dt'\int_0^{t'} dt''\;[V_D(t'), [V_D(t''), \rho_D(0)]] + \cdots.$$

Instead of $[V', [V'', \rho]]$, we may also write $[V', V''\rho] + \text{h.c.}$, where h.c. stands for
the Hermitian conjugate, because the operators are Hermitian.

Time-dependent perturbation theory leads to Fermi’s golden rule for the transition 
rates. However, the procedure is often superficial. We shall go into more detail when 
we derive the golden rule in Sect. 4.6. 

An exact treatment without approximations can be found for the time-dependent 
oscillator. This was already done for the classical case in Sect. 2.3.10 and especially 
in Sect. 2.4.11. In particular, the Hamilton operator 


Hi) = 1 P? ye 2y2 
— 2 \2n' 2” 


leads to the eigenvalue problem of the usual (time-independent) oscillator of mass 
m and angular frequency w. The time dependence is contained here in the classical 
function & (t) and thus involves no time-ordering problem—and the time-independent 
oscillator has eigenvalues $\hbar\omega\,(n + \tfrac12)$, as will be shown on p. 359. But, as already in
classical mechanics (see Sect. 2.4.11), these values for the —m f (t) X with the force 
f are not the energy 


(P —mF XxX) m- 
= + 


E fX’, with F=f—f and F=0. 
2m 2 


In addition to the eigenvalues of H (t), Fig.4.11 shows the expectation values of 
the energy with respect to the eigenstates of H. However, the energy uncertainties 
are very large, and the values actually overlap in the right-hand picture. At least it 
becomes clear that the eigenvalues do depend on time, although the energy barely 
does so. In many cases, these properties of a time-dependent interaction are derived 
only in the adiabatic approximation (for sufficiently slow changes). 


Fig. 4.11 Eigenvalues of H (left panel, H(t)) and expectation values E = ⟨n|E|n⟩ (right panel, E(t))
for a time-dependent harmonic oscillator (both in the same arbitrary unit). Here a = 1/2
and q = 1/4 was chosen in the Mathieu equation. For t = 0 it is force-free


4.4.5 Current Density 


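For stationary problems, an expression for the probability current follows from the
time-dependent Schrödinger equation. Since the total probability is conserved (it is
equal to 1), according to p. 187, we have the continuity equation

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf j = 0,$$

where $\rho$ is the probability density $|\psi(t, \mathbf r)|^2$. Hence, from the Schrödinger equation,
we obtain

$$\frac{\partial\rho}{\partial t} = \psi^*\,\frac{\partial\psi}{\partial t} + \psi\,\frac{\partial\psi^*}{\partial t} = \frac{\psi^*\,H\psi - \psi\,(H\psi)^*}{i\hbar},$$

and with

$$H = \frac{\mathbf P\cdot\mathbf P - 2q\,\mathbf A\cdot\mathbf P + q^2\,\mathbf A\cdot\mathbf A}{2m} + q\,\Phi$$

for the Coulomb gauge (i.e., with $\mathbf P\cdot\mathbf A = \mathbf A\cdot\mathbf P$) with $\mathbf P\cdot\mathbf P \;\hat{=}\; (-i\hbar)^2\,\Delta$ and
$-\mathbf A\cdot\mathbf P \;\hat{=}\; i\hbar\,\mathbf A\cdot\nabla$, we conclude

$$\frac{\partial\rho}{\partial t} = -\frac{\hbar}{i}\,\frac{\psi^*\Delta\psi - \psi\,\Delta\psi^*}{2m} + q\,\mathbf A\cdot\frac{\psi^*\nabla\psi + \psi\,\nabla\psi^*}{m}.$$

Here, according to p. 16, the first numerator is equal to $\nabla\cdot(\psi^*\nabla\psi - \psi\,\nabla\psi^*)$ and
the second is equal to $\nabla(\psi^*\psi)$. Hence, assuming the Coulomb gauge, the probability
current density is given by

$$\mathbf j = \frac{\hbar}{i}\,\frac{\psi^*\nabla\psi - \psi\,\nabla\psi^*}{2m} - \frac{q\,\mathbf A}{m}\,\psi^*\psi.$$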

Note that, for real wave functions, only the last term contributes. With $\hbar\nabla\psi = i\,\mathbf P\psi$
and $\hbar\nabla\psi^* = -i\,(\mathbf P\psi)^*$, together with $\mathbf A = \mathbf A^*$, and in the real-space representation,
this is equivalent to

$$\mathbf j = \mathrm{Re}\Big(\psi^*\,\frac{\mathbf P - q\,\mathbf A}{m}\,\psi\Big).$$

Here, classically, $(\mathbf p - q\mathbf A)/m$ is the velocity for a point-like particle of mass $m$ and
charge $q$, and $\psi^*\psi$ is the probability density. For the electric current density, we
obtain $q\,\mathbf j$.
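For a field-free case the expression can be tried out in one dimension, where it reduces to $j = (\hbar/m)\,\mathrm{Im}(\psi^*\,d\psi/dx)$. The following sketch assumes $\mathbf A = 0$, units with $\hbar = m = 1$ by default, and a central finite difference for the derivative; the function name and the test wave functions are illustrative.

```python
from cmath import exp
from math import cos

def current_density(psi, x, mass=1.0, hbar=1.0, h=1e-5):
    """j = (hbar/m) Im(psi* dpsi/dx) in one dimension, for vanishing vector potential."""
    dpsi = (complex(psi(x + h)) - complex(psi(x - h))) / (2 * h)
    return hbar / mass * (complex(psi(x)).conjugate() * dpsi).imag

k = 2.0
plane_wave = lambda x: exp(1j * k * x)      # carries the current hbar*k/m
standing_wave = lambda x: cos(k * x)        # real wave function, no current

print(current_density(plane_wave, 0.3), current_density(standing_wave, 0.3))
```

The plane wave gives $j = \hbar k/m$ times the (unit) probability density, while the real standing wave carries no current, in line with the remark about real wave functions.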

For spherically symmetric problems, we prefer to take the wave function

$$\psi_{nlm}(\mathbf r) = \frac{u_{nl}(r)}{r}\; i^l\; Y^{(l)}_m(\Omega),$$

with the spherical harmonic $Y^{(l)}_m(\Omega)$ of p. 335, which is real up to the factor $\exp(im\varphi)$.
Note that the radial functions $u_{nl}$ are real for bound states, but complex for scattering
states, as we shall see in Sect. 4.6. If we call the mass $m_0$, in order not to confuse
with the directional quantum number $m$, and refer to p. 39, then using

$$\nabla = \mathbf e_r\,\frac{\partial}{\partial r} + \mathbf e_\theta\,\frac{1}{r}\frac{\partial}{\partial\theta} + \mathbf e_\varphi\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\varphi},$$

it follows for bound states that

$$\mathbf j = \mathbf e_\varphi\;\frac{m\hbar}{m_0}\;\frac{|\psi_{nlm}(\mathbf r)|^2}{r\sin\theta}.$$

The term in $\mathbf A$ is missing here, because we have restricted ourselves to spherically
symmetric potentials. For bound states and eigenstates of the orbital angular momentum
$L_z$, there is a probability current only around the $L_z$-axis, and only if $m \neq 0$.

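For electrons, however, the spin (and magnetic moment) have to be considered.
We should take the Pauli equation from p. 327 as the Hamilton operator. Hence,
noting that electrons have negative charge $q = -e$, we start with

$$H = H_0 + \mu_B\,\mathbf B\cdot\boldsymbol\sigma.$$

Since $\boldsymbol\sigma$ appears here, we use the spinors

$$\psi = \begin{pmatrix}\psi_\uparrow \\ \psi_\downarrow\end{pmatrix} \quad\Longrightarrow\quad \psi^\dagger = (\psi_\uparrow^*,\; \psi_\downarrow^*),$$

and find the equations

$$i\hbar\,\frac{\partial\psi}{\partial t} = H_0\,\psi + \mu_B\,\mathbf B\cdot\boldsymbol\sigma\,\psi,$$
$$-i\hbar\,\frac{\partial\psi^\dagger}{\partial t} = H_0\,\psi^\dagger + \mu_B\,\psi^\dagger\,\mathbf B\cdot\boldsymbol\sigma.$$

Note that $H_0$ acts like the unit operator in the spin space, but generally changes $\psi$ in
the position space, whence we have $H_0\,\psi^\dagger$ and not $\psi^\dagger H_0$ in the last row.

If we multiply the first equation on the left by $\psi^\dagger$ and the second on the right by
$\psi$, then subtract one from the other, it follows that

$$i\hbar\,\frac{\partial}{\partial t}\,(\psi^\dagger\psi) = \psi^\dagger\,H_0\,\psi - (H_0\,\psi^\dagger)\,\psi.$$

Hence for the probability current density $\mathbf j$, we obtain nearly the same expression as
previously. Instead of $\psi^*\,(\mathbf P - q\mathbf A)\,\psi$, it now reads

$$\psi_\uparrow^*\,(\mathbf P - q\mathbf A)\,\psi_\uparrow + \psi_\downarrow^*\,(\mathbf P - q\mathbf A)\,\psi_\downarrow.$$

For the electric current density, we should now not only take $q\,\mathbf j$ (with $q = -e$ for
electrons), but also consider the magnetic moments, and according to p. 192, add
$\nabla\times\mathbf M$, using $\mathbf M = -\mu_B\,\psi^\dagger\boldsymbol\sigma\,\psi$ for electrons.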


4.4.6 Summary: Time Dependence 


The time dependence is determined by the Hamilton operator. Then we distinguish 
between the Heisenberg and Schrödinger pictures, depending on whether only the 
observables or only the state vectors depend on time, respectively. In the Schrödinger 
picture, we have the time-dependent Schrödinger equation 


\[
i\hbar\,\frac{d|\psi\rangle}{dt} = H\,|\psi\rangle ,
\]


and in the Heisenberg picture, the Heisenberg equation 


\[
\frac{dA}{dt} = \frac{i}{\hbar}\,[H, A] + \frac{\partial A}{\partial t} ,
\]

which can be looked at as the quantum generalization of the classical equation
$da/dt = [a, H] + \partial a/\partial t$ (p. 124). We may also take observables and basis vectors as
constant and describe the time dependence by the density operator. This then obeys
the von Neumann equation (in the Schrödinger picture), viz.,


\[
\frac{d\rho}{dt} = \frac{i}{\hbar}\,[\rho, H] ,
\]



which is the generalization of the Liouville equation to quantum theory. 
Stationary states have a well-defined energy. Hence, if H does not depend on time, 
they are eigenstates of the Hamilton operator: 


\[
H\,|\psi_n\rangle = |\psi_n\rangle\,E_n ,
\]

and in the Schrödinger picture they contain the time factor $\exp(-iE_nt/\hbar)$. This leads
from the time-dependent to the time-independent Schrödinger equation (the last
equation), which in the real-space representation has the form

\[
\Big(-\frac{\hbar^2}{2m}\,\Delta + V(\mathbf{r})\Big)\,\psi_n(\mathbf{r}) = E_n\,\psi_n(\mathbf{r}) ,
\]

since with $\mathbf{P} = (\hbar/i)\nabla$, $T = \frac{1}{2m}\,\mathbf{P}\cdot\mathbf{P}$ turns into $T = -\frac{1}{2m}\,\hbar^2\Delta$. For particles with spin
in a magnetic field, special terms also appear for the potential energy V.

If the problem cannot be solved for the full Hamilton operator H, but can be solved
for the time-independent approximation $H_0 = H - V$, a perturbation theory is possible,
using the Dirac picture. Then $H_0$ determines the time dependence of the observables and
$V_D = U_0^\dagger\, V\, U_0$ that of the states.


4.5 Time-Independent Schrödinger Equation 


4.5.1 Eigenvalue Equation for the Energy 


In this section we search for the eigenvalues $E_n$ and eigenvectors $|n\rangle$ of the Hamilton
operator H for a given interaction. We deal with the equation $H\,|n\rangle = |n\rangle\,E_n$ and
assume that H has the form T + V with the (local) potential energy $V(\mathbf{r})$. (We shall
treat special cases, and in particular a magnetic field and also particles with spin 1/2,
at the end. The exchange interaction in the Hartree-Fock potential is nonlocal, as
we shall see in Sect. 5.4.2.) Actually, V is an operator which is fixed in the real-
space representation by $\langle\mathbf{r}|\,V\,|\mathbf{r}'\rangle$. But for the local interaction here, we can write
$V(\mathbf{r})\,\delta(\mathbf{r} - \mathbf{r}')$ and $V(\mathbf{r})\,\psi(\mathbf{r})$ instead of $\langle\mathbf{r}|\,V\,|\psi\rangle = \int d^3r'\,\langle\mathbf{r}|\,V\,|\mathbf{r}'\rangle\langle\mathbf{r}'|\psi\rangle$.

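We shall usually take the real-space representation in order to make use of this
locality of the interaction. From $(H - E_n)\,|n\rangle = 0$ and $\langle\mathbf{r}|n\rangle = \psi_n(\mathbf{r})$ with $\mathbf{P} = -i\hbar\nabla$,
we obtain the differential equation

\[
\Big(-\frac{\hbar^2}{2m}\,\Delta + V(\mathbf{r}) - E_n\Big)\,\psi_n(\mathbf{r}) = 0 .
\]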


This is not yet an eigenvalue equation though, but only a partial, linear, and homo- 
geneous differential equation of second order (for which the value and the gradient 




of the solution at a boundary can still be given arbitrarily in order to fix a special 
solution). 

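But $\psi_n(\mathbf{r})$ will now be a probability amplitude, which means that the expression
$\int d^3r\,|\psi_n(\mathbf{r})|^2$ will be normalized to 1. However, we have also allowed improper
Hilbert vectors, for which we have

\[
\int d^3r\ \psi_n^*(\mathbf{r})\,\psi_{n'}(\mathbf{r}) = \delta(n - n') ,
\]

with continuous n and n'. But for discrete values we require

\[
\int d^3r\ \psi_n^*(\mathbf{r})\,\psi_{n'}(\mathbf{r}) = \delta_{nn'} ,
\]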


which can only be satisfied for special energies, as will be shown soon. 
In order to make that clear, we restrict ourselves to the one-dimensional problem, 
i.e., to a standard differential equation, and consider 


\[
\psi''(x) + \frac{2m}{\hbar^2}\,\{E - V(x)\}\,\psi(x) = 0 .
\]

If V(x) decreases faster than $|x|^{-1}$ for large |x|, so that $E - V \to \hbar^2k^2/2m$, the
asymptotic solutions exp(±ikx) for k ≠ 0 can be superposed linearly. For k² > 0,
they oscillate and we can normalize in the continuum. But for k² < 0, we can only take
exp(−|kx|), since exp(+|kx|) is not normalizable. For E < V(x), all wave functions
have to vanish exponentially for x → ±∞, with specific dependence according to the
differential equation. This is possible only for appropriate (countable) eigenvalues.

These considerations are also valid for the case in which V(x) behaves asymptotically
as $|x|^{-1}$ (which requires an amendment ∝ i ln|kx| to the exponent). The sign
of E − V is decisive, also in three dimensions.


4.5.2 Reduction to Ordinary Differential Equations 


We shall only consider potentials whose variables can be separated, i.e., potentials 
which can be written as a sum of terms, each of which depends on only one variable. 
Then the partial differential equation can be separated into three ordinary ones and 
solved much more easily. 

Suppose for example that $V(\mathbf{r}) = V(x) + V(y) + V(z)$. Then the product ansatz,
with each factor involving just one Cartesian coordinate, i.e., $\langle x|n_x\rangle\langle y|n_y\rangle\langle z|n_z\rangle$ (and
energy $E_n$ separating into three terms) provides a way forward. In this way, the given
partial differential equation can be reduced to three ordinary ones of the form

\[
\Big(\frac{d^2}{dx^2} + \frac{2m}{\hbar^2}\,\{E_{n_x} - V(x)\}\Big)\,\langle x|n_x\rangle = 0 .
\]


If we multiply this equation by $\langle y|n_y\rangle\langle z|n_z\rangle$ and add the corresponding equations in
the variables y and z, then with $E_n = E_{n_x} + E_{n_y} + E_{n_z}$, we have the original partial
differential equation. If at least two of these potentials are the same, then degeneracy
arises and the different equations result in the same eigenvalues.

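For a central potential $V(\mathbf{r}) = V(r)$ spherical coordinates are usually more appropriate
than Cartesian ones. As is well known, the Laplace operator in spherical coordinates
reads (see p. 39)

\[
\Delta\psi = \Big[\frac{1}{r}\,\frac{\partial^2}{\partial r^2}\,r + \frac{1}{r^2}\Big\{\frac{1}{\sin\theta}\,\frac{\partial}{\partial\theta}\Big(\sin\theta\,\frac{\partial}{\partial\theta}\Big) + \frac{1}{\sin^2\theta}\,\frac{\partial^2}{\partial\varphi^2}\Big\}\Big]\,\psi .
\]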


According to p. 335, the eigenfunctions of the operator in the curly bracket are the
spherical harmonics, with the eigenvalue −l(l+1). In classical mechanics, for a
central field, we also made use of the angular momentum as a conserved quantity
(p. 142). We thus set

\[
\psi_{nlm}(r, \theta, \varphi) = \frac{u_{nl}(r)}{r}\; i^l\, Y^{(l)}_m(\Omega) ,
\]

where m is the directional quantum number, and obtain the radial equation

\[
\Big(\frac{d^2}{dr^2} - \frac{l(l+1)}{r^2} + \frac{2m}{\hbar^2}\,\{E_{nl} - V(r)\}\Big)\,u_{nl}(r) = 0 , \quad\text{with}\quad u_{nl}(0) = 0 ,
\]

with m the mass once again. This boundary condition is needed for $\psi_{nlm}$ to be differentiable
at the coordinate origin, since we have divided by r. The further boundary condition
$u_{nl} \to 0$ for r → ∞ is still required for the normalizability of the bound states. It
leads to an eigenvalue equation for the energy. Note that these eigenvalues no longer
depend on the directional quantum number m. The spherical symmetry leads to a
(2l+1)-fold degeneracy, i.e., there are 2l+1 different eigensolutions with equal energy.

Near the origin, for l ≠ 0, the second term usually outweighs the other ones and we
have $u'' - l(l+1)\,r^{-2}\,u \approx 0$. This differential equation has the linearly independent
solutions $r^{-l}$ and $r^{l+1}$. Only the second vanishes at the origin (also for l = 0). Hence,
we usually set $u_{nl}$ in the form $u_{nl}(r) = r^{l+1} f(r)$.


4.5.3 Free Particles and the Box Potential 


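For free particles, the Hamilton operator consists of only the kinetic energy P²/(2m),
so we use the eigenfunctions of the momentum, or indeed of $\mathbf{k} = \mathbf{p}/\hbar$, from
Sect. 4.3.3:

\[
\frac{P^2}{2m}\,\psi_{\mathbf{k}} = E_k\,\psi_{\mathbf{k}} , \qquad E_k = \frac{\hbar^2k^2}{2m} , \qquad \psi_{\mathbf{k}}(\mathbf{r}) = \frac{\exp(i\mathbf{k}\cdot\mathbf{r})}{\sqrt{2\pi}^{\,3}} .
\]

There we also saw that $\int d^3r\ \psi_{\mathbf{k}}^*(\mathbf{r})\,\psi_{\mathbf{k}'}(\mathbf{r}) = \delta(\mathbf{k} - \mathbf{k}')$ in that case.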

The sharp wave vector k (and the sharp energy $E_k$) are idealizations. Actually,
for these continuous variables, we should consider their uncertainty and hence take
a superposition of terms with different wave vectors, a so-called wave packet. The
energy uncertainty means that we cannot simply split off a factor exp(−iωt), but we
only have

\[
\psi(t, \mathbf{r}) = \frac{1}{\sqrt{2\pi}^{\,3}} \int d^3k\ \psi(\mathbf{k})\,\exp\{i(\mathbf{k}\cdot\mathbf{r} - \omega t)\} ,
\]

because $\omega = \hbar k^2/(2m)$ depends upon k. If only wave numbers from the near
neighborhood of $\bar{k}$ contribute, then the group velocity of this wave packet, viz.,
$(d\omega/dk)_{\bar{k}} = \hbar\bar{k}/m = \bar{p}/m = \bar{v}$, is twice the phase velocity ω/k. Hence, in the course
of time, the wave packet changes shape. If we take, e.g., a Gauss function for ψ(k),
as on p. 321 (the smallest possible uncertainty product Δx(0)·Δk = 1/2), then the
position uncertainty increases with time:

\[
\Delta x(t) = \Delta x(0)\,\sqrt{1 + \{2\hbar\,(\Delta k)^2\,t/m\}^2} = \sqrt{\{\Delta x(0)\}^2 + \{\Delta v\; t\}^2} ,
\]

since Δx(0)·Δk = 1/2 and $\Delta v = \hbar\,\Delta k/m$, while the center $\bar{x}$ moves with the velocity $\bar{v} = \hbar\bar{k}/m$.
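
The spreading law can be evaluated directly. The following Python sketch is only an illustration under stated assumptions: the function names and the example of an electron with Δx(0) = 0.1 nm are chosen here and do not come from the text. It computes Δx(t) in both forms given above and the time after which the width has grown by a factor √2.

```python
import math

HBAR = 1.054571817e-34   # J s
M_E = 9.1093837015e-31   # kg, electron mass (example particle, an assumption)


def width(t, dx0, mass, hbar=HBAR):
    """Delta x(t) = dx0 * sqrt(1 + (2 hbar dk^2 t / m)^2) with dx0 * dk = 1/2."""
    dk = 1.0 / (2.0 * dx0)
    return dx0 * math.sqrt(1.0 + (2.0 * hbar * dk**2 * t / mass) ** 2)


def width_from_velocity_spread(t, dx0, mass, hbar=HBAR):
    """Same quantity written as sqrt(dx0^2 + (dv t)^2) with dv = hbar dk / m."""
    dv = hbar / (2.0 * dx0 * mass)
    return math.hypot(dx0, dv * t)


def sqrt2_time(dx0, mass, hbar=HBAR):
    """Time after which the width has grown to sqrt(2) * dx0, i.e. dv * t = dx0."""
    return 2.0 * mass * dx0**2 / hbar


if __name__ == "__main__":
    dx0 = 1e-10  # 0.1 nm
    t = sqrt2_time(dx0, M_E)   # of the order of 1e-16 s for an electron
    print(t, width(t, dx0, M_E) / dx0)
```

For a free electron localized to atomic dimensions, the packet thus broadens noticeably within a fraction of a femtosecond, whereas for macroscopic masses the same formula gives practically no spreading.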

A further example is that of a box with impermeable walls. Here the probability
density may differ from zero only inside the box. Outside the container, the wave
function must vanish, since the time-independent Schrödinger equation makes sense
only if $V(\mathbf{r})\,\psi(\mathbf{r})$ is finite everywhere. In addition, the wave function must also be
differentiable, thus continuous everywhere. This allows only a countable sequence
of energies.

In the one-dimensional case, with V(x) = 0 for 0 ≤ x ≤ a, otherwise infinite, the
boundary conditions ψ(0) = 0 = ψ(a) and the normalization to 1 fix the eigensolutions
up to a phase factor. For n ∈ {1, 2, ...} and the abbreviation

\[
k_n = \frac{n\pi}{a} ,
\]

and with $\psi_n'' + k_n^2\,\psi_n = 0$, we have

\[
\psi_n(x) = \sqrt{\frac{2}{a}}\,\sin k_nx , \quad\text{for } 0 \le x \le a, \text{ otherwise zero}, \qquad E_n = \frac{\hbar^2k_n^2}{2m} .
\]


There is no normalizable solution for n = 0, and negative integers n deliver no further 
linearly independent solutions (see also Fig. 4.12). 

Correspondingly, for a cuboid in three dimensions with side lengths $a_x, a_y, a_z$ and
volume $V = a_xa_ya_z$, if we have $k_i = n_i\pi/a_i$ with $n_i \in \{1, 2, \ldots\}$,

\[
\psi_n(\mathbf{r}) = \sqrt{\frac{8}{V}}\,\sin k_xx\,\sin k_yy\,\sin k_zz , \quad\text{for } 0 \le x \le a_x, \text{ etc.}, \qquad
E_n = \frac{\hbar^2\,(k_x^2 + k_y^2 + k_z^2)}{2m} .
\]

Fig. 4.12 Energy eigenvalues and eigenfunctions of a box potential with infinitely high walls. The
figure shows the potential and also the eigenvalues as horizontal lines. Each of these lines serves as
an axis for the associated eigenfunction, where functions with even n are plotted with continuous
lines, and those with odd n as dashed lines

For a cube ($a_x = a_y = a_z$), there is degeneracy due to the symmetry, since we can
permute $n_x, n_y, n_z$ with each other and obtain the same energy value $E_n \propto n^2 =
n_x^2 + n_y^2 + n_z^2$. In addition, there are also accidental degeneracies. For example,
the state $(n_x, n_y, n_z)$ = (3, 3, 3) and the three states (5,1,1), (1,5,1), and (1,1,5) have
the same energy, because here n² is equal to 27 for each.
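
Such degeneracies are easy to enumerate. The following Python sketch (the function names are chosen here for illustration) groups the states of the cube by n² = n_x² + n_y² + n_z² and marks a level as accidentally degenerate if it contains states that are not permutations of each other.

```python
from collections import defaultdict


def cube_levels(n_max):
    """Group the triples (nx, ny, nz) with 1 <= n_i <= n_max by n^2 = nx^2 + ny^2 + nz^2.

    Only levels with n^2 <= (n_max + 1)^2 + 1 are guaranteed to be complete.
    """
    levels = defaultdict(list)
    for nx in range(1, n_max + 1):
        for ny in range(1, n_max + 1):
            for nz in range(1, n_max + 1):
                levels[nx * nx + ny * ny + nz * nz].append((nx, ny, nz))
    return levels


def is_accidental(states):
    """True if the level contains states that are not permutations of one another."""
    return len({tuple(sorted(s)) for s in states}) > 1


if __name__ == "__main__":
    levels = cube_levels(6)
    print(sorted(levels[27]))
    print(min(n2 for n2, states in levels.items() if is_accidental(states)))
```

With this grouping, the level n² = 27 turns out to be the lowest one in which an accidental degeneracy occurs.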

The potential discussed here is used for the Fermi gas model. In this many-body 
model, we neglect the interaction between the particles and consider only the quantum 
conditions, which stem from the inclusion of the particles in the cube volume. In 
contrast to the classical behavior, only discrete energy values (and wave numbers) 
are allowed. For such a gas we also need the number of states whose energy is less 
than an energy bound called the Fermi energy: 


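\[
E_F = \frac{\hbar^2k_F^2}{2m} .
\]

Then clearly, $n^2 \le (ak_F/\pi)^2$. Hence contributions come from all points with positive
integer Cartesian coordinates inside the sphere of radius $ak_F/\pi$. For sufficiently large
$ak_F$, the number of states is

\[
\frac{1}{8}\,\frac{4\pi}{3}\,\Big(\frac{ak_F}{\pi}\Big)^3 = \frac{V}{6\pi^2}\,k_F^3 = \frac{V}{6\pi^2}\,\Big(\frac{2mE_F}{\hbar^2}\Big)^{3/2} .
\]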

According to the Pauli principle, for spin-1/2 particles, each of these states can be 
occupied by two fermions. 
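
How large $ak_F$ has to be for this estimate can be checked by direct counting. The Python sketch below is an illustration only (function names are assumptions made here); it counts the lattice points with positive integer coordinates inside the sphere of radius $ak_F/\pi$ and compares the result with one eighth of the sphere volume.

```python
import math


def count_states(radius):
    """Number of triples (nx, ny, nz) of positive integers with nx^2 + ny^2 + nz^2 <= radius^2."""
    r2 = radius * radius
    count = 0
    nx = 1
    while nx * nx < r2:
        ny = 1
        while nx * nx + ny * ny < r2:
            rest = r2 - nx * nx - ny * ny
            count += math.isqrt(int(rest))   # number of allowed nz >= 1
            ny += 1
        nx += 1
    return count


def octant_estimate(radius):
    """Large-radius estimate (1/8) (4 pi / 3) radius^3."""
    return math.pi * radius**3 / 6.0


if __name__ == "__main__":
    for radius in (5, 10, 20, 40):
        print(radius, count_states(radius), octant_estimate(radius))
```

The exact count stays somewhat below the estimate, because points on the coordinate planes do not contribute; the relative deviation decreases roughly like the inverse radius.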


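If we search for the bound states, with negative energy eigenvalues $E_n < 0$, in
a box of finite depth $V_0$ and width a = 2l, i.e., with $V(x) = -|V_0|$ for −l < x < l,
otherwise zero, then with the real abbreviations

\[
\kappa_n = \sqrt{\frac{2m}{\hbar^2}\,|E_n|} \quad\text{and}\quad k_n = \sqrt{\frac{2m}{\hbar^2}\,(|V_0| - |E_n|)} ,
\]

the differential equations $\psi'' - \kappa_n^2\,\psi = 0$ for |x| > l and $\psi'' + k_n^2\,\psi = 0$ for |x| < l
imply a set of even states $\{\psi_+(x) = \psi_+(-x)\}$:

\[
\psi_+(x) \propto \begin{cases}
\exp\{\kappa_n(l + x)\} & \text{for } x \le -l , \\
\alpha\,\cos(k_nx) & \text{for } -l \le x \le +l , \\
\exp\{\kappa_n(l - x)\} & \text{for } l \le x ,
\end{cases}
\]

and a set of odd states $\{\psi_-(x) = -\psi_-(-x)\}$:

\[
\psi_-(x) \propto \begin{cases}
+\exp\{\kappa_n(l + x)\} & \text{for } x \le -l , \\
\beta\,\sin(k_nx) & \text{for } -l \le x \le +l , \\
-\exp\{\kappa_n(l - x)\} & \text{for } l \le x .
\end{cases}
\]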


The wave functions and their first derivatives have to be continuous everywhere,
otherwise a differential equation of second order does not make sense. (In the present
case, the second derivative jumps twice by a finite value. For the previously considered
infinite potential step, however, the second derivative changes so considerably
stepwise that even the first derivative jumps there.) At the limits x = ±l, these
properties fix α and β and also require as eigenvalue condition that $\kappa_n/k_n$ be equal to
$\tan(k_nl)$ for the even states and $-\cot(k_nl)$ for the odd states. These requirements
with $z = k_nl = \frac{1}{2}k_na$, $\zeta = (2m\hbar^{-2}\,|V_0|)^{1/2}\,l$, and $\kappa_n^2/k_n^2 = \zeta^2/z^2 - 1$ are easier to
solve, if we satisfy (starting with n = 0)

\[
\begin{aligned}
\text{even eigensolution:}\quad & |\cos z| = \frac{z}{\zeta} \quad\text{for } n\pi \le z \le (n + \tfrac{1}{2})\,\pi , \\
\text{odd eigensolution:}\quad & |\sin z| = \frac{z}{\zeta} \quad\text{for } (n + \tfrac{1}{2})\,\pi \le z \le (n + 1)\,\pi .
\end{aligned}
\]

From $z = k_nl$, it follows that $E_n = -|V_0|\,(1 - z^2/\zeta^2)$. For finite $V_0a^2$, there are also
only finitely many bound-state eigensolutions, namely at most $2\zeta/\pi + 1$ (see Fig. 4.13).
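
The eigenvalue conditions are transcendental, but each of the intervals above contains at most one solution, so a simple bisection suffices. The following Python sketch is an illustration under stated assumptions (function names, tolerance-free fixed iteration count, and the example value ζ = 2 are chosen here); it returns the roots z and the energies in units of $|V_0|$.

```python
import math


def bound_state_roots(zeta, iterations=200):
    """Roots z = k_n l of |cos z| = z/zeta (even) and |sin z| = z/zeta (odd), with z <= zeta.

    The j-th interval [j pi/2, (j+1) pi/2] holds an even state for even j
    and an odd state for odd j; in each interval the difference
    trig(z) - z/zeta falls monotonically, so bisection is safe.
    """
    roots = []
    j = 0
    while j * math.pi / 2 <= zeta:
        trig = math.cos if j % 2 == 0 else math.sin
        lo = j * math.pi / 2
        hi = min((j + 1) * math.pi / 2, zeta)
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if abs(trig(mid)) - mid / zeta > 0.0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
        j += 1
    return roots


def energies(zeta):
    """Bound-state energies E_n / |V_0| = -(1 - z^2 / zeta^2)."""
    return [-(1.0 - (z / zeta) ** 2) for z in bound_state_roots(zeta)]


if __name__ == "__main__":
    print(bound_state_roots(2.0))
    print(energies(2.0))
```

The number of roots found in this way equals the number of points jπ/2 (j = 0, 1, 2, ...) that do not exceed ζ, in line with the graphical construction of Fig. 4.13.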
Fig. 4.13 Eigenvalues for the box potential of finite depth. Solutions are the intersections (full
circles) of the straight line z/ζ with the curves |cos z| (continuous lines) and |sin z| (dashed
lines). Here, $\zeta = |2mV_0|^{1/2}\,a/2\hbar$

For the unbound solutions (“continuum states” with arbitrary E > 0), the potential
can be attractive or repulsive:

\[
V(x) = V_0 , \quad\text{for } -a \le x \le 0, \text{ otherwise zero}.
\]

Here we use the real abbreviations

\[
K = \sqrt{2m\hbar^{-2}\,E} \quad\text{and}\quad k = \sqrt{2m\hbar^{-2}\,|E - V_0|} ,
\]

and let a wave come in from the left (x < −a). At the potential steps, it is partially
reflected and partially refracted. For $E > V_0$, we then have

\[
\psi(x) \propto \begin{cases}
A\,\exp\{iK(x + a)\} + B\,\exp\{-iK(x + a)\} , & \text{for } x \le -a , \\
\cos kx + i\kappa\,\sin kx , & \text{for } -a \le x \le 0 , \\
\exp(iKx) , & \text{for } 0 \le x ,
\end{cases}
\]

with κ = K/k. Here use has already been made of the continuity of the wave function
and of its first derivative at x = 0, and the factor for x > 0 was set arbitrarily equal
to 1, while a common factor is still missing. The continuity conditions for x = −a
require

\[
A = \cos ka - i\,\frac{\kappa + \kappa^{-1}}{2}\,\sin ka \quad\text{and}\quad B = i\,\frac{\kappa^{-1} - \kappa}{2}\,\sin ka .
\]

With the parameter $\zeta = |2mV_0|^{1/2}\,a/2\hbar$, we have $ka = 2\zeta\,|E/V_0 - 1|^{1/2}$. For $E < V_0$,
k is to be replaced by ik (κ by −iκ) and we note that cos iz = cosh z and sin iz =
i sinh z.

If the probability current density $j_d$ is refracted (transmitted), then the probability
current density $j_e = j_d\,|A|^2$ comes in and $j_r = j_d\,|B|^2$ is reflected. The transmittance
$D = j_d/j_e$ and reflectivity $R = j_r/j_e$ together sum to 1: $D + R = (1 + |B|^2)/|A|^2 = 1$.
We obtain (see Fig. 4.14)

\[
D = \begin{cases}
\Big(1 + \dfrac{V_0^2\,\sin^2 ka}{4E\,|E - V_0|}\Big)^{-1} & \text{for } E > V_0 , \\[2ex]
\Big(1 + \dfrac{V_0^2\,\sinh^2 ka}{4E\,|E - V_0|}\Big)^{-1} & \text{for } E < V_0 .
\end{cases}
\]

Fig. 4.14 Transmittance D at steps of height $V_0$ and width a as a function of the energy E for
three values of the parameter $\zeta = |2mV_0|^{1/2}\,a/2\hbar$, namely 1/2 (green), 1 (blue), and 2 (red).
The classical case is shown with a dashed line

While for $E < V_0$ nothing is refracted classically, according to quantum theory, the
tunnel effect occurs because the uncertainty relations have to be observed. Due to
the position uncertainty, the finite length a does not “really” act, and because of the
momentum uncertainty, neither does the finite potential step height. In particular, for
ka ≫ 1 (and $E < V_0$), we have

\[
D \approx \frac{16E\,|E - V_0|}{V_0^2}\,\exp(-2ka) .
\]

On the other hand, for $E > V_0$, D = 1 classically, but according to quantum theory
all is refracted only if $E \gg |V_0|$ or ka is an integer multiple of π. This is also shown
in Fig. 4.14.
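
The formulas for A, B, and D can be cross-checked numerically. The Python sketch below is an illustration only: the function names and the sample values of E, $V_0$, and ζ are assumptions made here. Energies enter only through the ratio $E/V_0$, and the width through $\zeta = |2mV_0|^{1/2}\,a/2\hbar$.

```python
import math


def transmittance(energy, v0, zeta):
    """Transmittance D for E > 0, E != V0, with ka = 2 zeta |E/V0 - 1|^(1/2)."""
    ka = 2.0 * zeta * math.sqrt(abs(energy / v0 - 1.0))
    s = math.sin(ka) if energy > v0 else math.sinh(ka)
    return 1.0 / (1.0 + v0**2 * s**2 / (4.0 * energy * abs(energy - v0)))


def amplitudes(energy, v0, zeta):
    """Amplitudes A and B of the incoming and reflected wave for E > V0 (kappa = K/k)."""
    ka = 2.0 * zeta * math.sqrt(abs(energy / v0 - 1.0))
    kappa = math.sqrt(energy / (energy - v0))
    a = math.cos(ka) - 0.5j * (kappa + 1.0 / kappa) * math.sin(ka)
    b = 0.5j * (1.0 / kappa - kappa) * math.sin(ka)
    return a, b


def tunnel_estimate(energy, v0, zeta):
    """Approximation D ~ 16 E |E - V0| / V0^2 * exp(-2 ka), valid for ka >> 1 and E < V0."""
    ka = 2.0 * zeta * math.sqrt(abs(energy / v0 - 1.0))
    return 16.0 * energy * abs(energy - v0) / v0**2 * math.exp(-2.0 * ka)


if __name__ == "__main__":
    a, b = amplitudes(2.0, 1.0, 1.5)
    print(1.0 / abs(a) ** 2, abs(b) ** 2 / abs(a) ** 2)     # D and R
    print(transmittance(0.5, 1.0, 3.0), tunnel_estimate(0.5, 1.0, 3.0))
```

In this parametrization, full transmission for $E > V_0$ occurs at $E/V_0 = 1 + (n\pi/2\zeta)^2$ for a repulsive step, which are the maxima visible in Fig. 4.14.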


4.5.4 Harmonic Oscillations 


We shall not determine the eigenvalues for linear oscillations here using their
differential equation and boundary conditions, but algebraically, using some extremely
useful operators. We have $H = \frac{1}{2m}\,P^2 + \frac{m}{2}\,\omega^2X^2$. With an energy unit ħω, a momentum
unit $p_0 = \sqrt{2\hbar m\omega}$, and a length unit $x_0 = 2\hbar/p_0 = \sqrt{2\hbar/m\omega}$, this leads to the
equation

\[
\frac{H}{\hbar\omega} = \frac{X^2}{x_0^2} + \frac{P^2}{p_0^2} .
\]

If we now set

\[
X = x_0\,\frac{\Psi + \Psi^\dagger}{2} \quad\text{and}\quad P = p_0\,\frac{\Psi - \Psi^\dagger}{2i} ,
\]

whence $\Psi = X/x_0 + iP/p_0$ and $\Psi^\dagger = X/x_0 - iP/p_0$, then the commutation relation
$[X, P] = i\hbar\,1$ together with $x_0p_0 = 2\hbar$ imply the equation

\[
[\Psi, \Psi^\dagger] = 1 ,
\]

and in addition

\[
H = \tfrac{1}{2}\,\hbar\omega\,\{\Psi, \Psi^\dagger\} = \hbar\omega\,(\Psi^\dagger\Psi + \tfrac{1}{2}) .
\]

The commutation relation $[\Psi, \Psi^\dagger] = 1$ is known already from p. 302, in particular,
for the creation and annihilation operators of bosons. From this commutation relation,
we obtained there the eigenvalues of $\Psi^\dagger\Psi$. Hence we already know the energy
eigenvalues of the linear oscillator:

\[
E_n = \hbar\omega\,(n + \tfrac{1}{2}) , \quad\text{with}\quad n \in \{0, 1, 2, \ldots\} .
\]


The energies of neighboring states all differ by ħω (see Fig. 4.15). This use of Bose
operators makes it possible to treat oscillations as particles. The sound quantum is
called a phonon, and the quantum of the electromagnetic field (the light quantum) a
photon.

The energy ħω/2 of the ground state, with n = 0, is called the zero-point energy.
It is not zero, because otherwise position and momentum would both be sharp.
Nevertheless, in the ground state the product of the uncertainties is as small as
possible. The expectation values of Ψ and $\Psi^\dagger$ vanish in the ground state and so also do
those of X and P. In contrast, for X² and P², it is important to note that
$\langle(\Psi^\dagger \pm \Psi)^2\rangle = \pm\langle\Psi\Psi^\dagger\rangle = \pm 1$. We thus have
$\Delta X = \frac{1}{2}x_0$ and $\Delta P = \frac{1}{2}p_0$, so their product is equal to $\frac{1}{2}\hbar$, and hence as small as
possible.

According to p. 128 the Hamilton function of a point charge in a magnetic field
can be transformed canonically to that of a linear oscillation with the cyclotron
frequency ω = qB/m. Quantum mechanically, we then find the energy eigenvalues
(Landau levels) with equal distances. However, degeneracy should be noted, as for
two-dimensional isotropic oscillations.

According to p. 321, we already know the wave functions of all states with the
smallest possible product of the uncertainties ΔX·ΔP: these are the Gauss functions
normalized to 1. Consequently, for the ground state we have

\[
\psi_0(x) = \sqrt[4]{\frac{2}{\pi}}\;\frac{1}{\sqrt{x_0}}\,\exp\Big(\frac{-x^2}{x_0^2}\Big) .
\]


Let us now turn to its remaining stationary states, i.e., those with sharp energy.
According to p. 302, their eigenfunctions can be built up with the creation
operators $\Psi^\dagger$ from the ground state: $|n\rangle = (n!)^{-1/2}\,(\Psi^\dagger)^n\,|0\rangle$. From there, we have
$\Psi^\dagger \,\hat{=}\, x/x_0 - \frac{1}{2}x_0\,d/dx$. With $s = \sqrt{2}\,x/x_0 = x\sqrt{m\omega/\hbar}$, this becomes $\Psi^\dagger \,\hat{=}\,
2^{-1/2}\,(s - d/ds)$. But we may also replace the operator s − d/ds by $-\exp(\frac{1}{2}s^2)\,
d/ds\,\exp(-\frac{1}{2}s^2)$ and apply it n times to $\psi_0$. Now we have Rodrigues’ formula for
Hermite polynomials:

\[
H_n(s) = (-)^n\,\exp(s^2)\,\frac{d^n}{ds^n}\,\exp(-s^2) .
\]

With $\delta(s - s') = \delta(x - x')\,x_0/\sqrt{2}$ and $x_0/\sqrt{2} = \sqrt{\hbar/m\omega}$, which implies $|s\rangle =
|x\rangle\,\sqrt[4]{\hbar/m\omega}$, the result is (see Fig. 4.15)

\[
\psi_n(s) = \frac{\exp(-\frac{1}{2}s^2)}{\sqrt{2^n\,n!\,\sqrt{\pi}}}\;H_n(s) .
\]

Fig. 4.15 Energy eigenvalues and eigenfunctions for linear oscillations. As in Fig. 4.12, we show
the potential and eigenvalues (horizontal lines). These lines also serve as axes for the associated
eigenfunctions, both even (continuous lines) and odd (dashed lines). States with sharp energy are
stationary. Only for uncertain energy do oscillations occur. This will be discussed in Sect. 5.5.3
(see also Figs. 4.20 and 4.21). As a function of the displacement, the eigenfunctions oscillate in the
classically allowed region, while in the classically forbidden regions, they tend to zero monotonically
(tunnel effect)

So we only need to know the Hermite polynomials.

Clearly, $H_0(s) = 1$ and $H_1(s) = 2s$. The other polynomials can be obtained faster
than by differentiation, if we use the recursion formula

\[
H_{n+1}(s) = 2s\,H_n(s) - 2n\,H_{n-1}(s) .
\]


Before the proof, we derive the generating function of the Hermite polynomials:

\[
\exp(2st - t^2) = \sum_{n=0}^{\infty} H_n(s)\,\frac{t^n}{n!} .
\]

We have $\exp\{-(t - s)^2\} = \sum_n \partial^n\exp\{-(t - s)^2\}/\partial t^n|_{t=0}\ t^n/n!$ according to Taylor.
Here the derivative, up to the factor $(-1)^n$, is equal to the n th derivative with respect
to s for t = 0, thus equal to $(-1)^n\,d^n\exp(-s^2)/ds^n$. Multiplication by $\exp(s^2)$ and
Rodrigues’ formula then give the stated expansion. Consequently, using the above-
mentioned generating function, we may derive further properties of the Hermite
polynomials. In particular, we only need to differentiate with respect to t and then
compare coefficients in order to prove the recursion formula. For |s| ≫ 1, we also find
$H_n(s) \approx (2s)^n$. If we differentiate the generating function with respect to s, then
$H_n' = 2n\,H_{n-1}$ and hence $H_n'' = 2n\,H_{n-1}'$. If we use the recursion formula in the first
derivative, we obtain the differential equation

\[
H_n''(s) - 2s\,H_n'(s) + 2n\,H_n(s) = 0 .
\]
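
The recursion formula lends itself to a direct implementation. The following Python sketch is an illustration (the function names and the representation of a polynomial as a list of integer coefficients in ascending powers of s are choices made here); it builds $H_0, \ldots, H_n$ from $H_0 = 1$, $H_1 = 2s$ and allows a check of $H_n' = 2n\,H_{n-1}$.

```python
def hermite_polynomials(n):
    """Coefficient lists (ascending powers of s) of H_0, ..., H_n via
    H_{k+1}(s) = 2 s H_k(s) - 2 k H_{k-1}(s)."""
    polys = [[1], [0, 2]]
    for k in range(1, n):
        prev, cur = polys[k - 1], polys[k]
        nxt = [0] + [2 * c for c in cur]      # 2 s H_k
        for i, c in enumerate(prev):
            nxt[i] -= 2 * k * c               # - 2 k H_{k-1}
        polys.append(nxt)
    return polys[: n + 1]


def derivative(poly):
    """Coefficient list of the derivative of a polynomial."""
    return [i * c for i, c in enumerate(poly)][1:]


if __name__ == "__main__":
    for k, h in enumerate(hermite_polynomials(4)):
        print(k, h)
```

Since only integer arithmetic is involved, the coefficients are exact; the leading coefficient $2^n$ reflects $H_n(s) \approx (2s)^n$ for large |s|.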


Written as a polynomial, we have

\[
H_n(s) = \sum_{k=0}^{[n/2]} a_k\,(2s)^{n-2k} , \quad\text{with}\quad \frac{a_{k+1}}{a_k} = -\frac{(n - 2k)(n - 2k - 1)}{k + 1} ,
\]

and $a_0 = 1$. Clearly, $H_n(-s) = (-)^n\,H_n(s)$, so we also know the parities of the states.

According to classical mechanics, there are oscillations only for T = E − V > 0.
Hence, we would have to require $\frac{1}{2}\hbar\omega\,(2n + 1) > \frac{1}{2}m\omega^2x^2$, or put another way,
$s^2 = x^2\,m\omega/\hbar < 2n + 1$. In fact, the Schrödinger equation for linear oscillations can
be written in the form $\psi_n''(s) + (2n + 1 - s^2)\,\psi_n(s) = 0$. For $s^2 = 2n + 1$, the sign of
$\psi_n''$ therefore changes, without $|\psi_n|^2$ vanishing for larger values of |s|. Moreover, in
the classically forbidden region (with T < 0), there is still a finite probability density.
We already met this tunnel effect in the last section.


In three dimensions, for the isotropic oscillator, we have

\[
E_n = (n + \tfrac{3}{2})\,\hbar\omega , \quad\text{with}\quad n = n_x + n_y + n_z \in \{0, 1, 2, \ldots\} .
\]

Except for the ground state, all states are degenerate: $n_x$ and $n_y$ can be chosen
arbitrarily, as long as their sum is ≤ n, while $n_z$ is then fixed. There are therefore $\frac{1}{2}(n + 2)(n + 1)$
different states in the same “oscillator shell”. They all have parity $(-1)^n$.

Since a central field is given, we can also express the oscillation quantum number
n in terms of the angular momentum quantum number l and the radial quantum
number $n_r$. There are always 2l+1 degenerate states of equal parity for each value
of l. However, the isotropic oscillator is more strongly degenerate. Here n and l are
either both even or both odd because of the parity. Their difference is an even number.
In fact, we have

\[
n = 2\,(n_r - 1) + l .
\]

Here the radial quantum number $n_r$ starts with the value 1, as is usual in nuclear
physics. We then have the following shells: 1s, 1p, 1d-2s, 1f-2p, 1g-2d-3s, and so on.
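
Both ways of counting the states of an oscillator shell can be compared in a few lines. The Python sketch below is an illustration (function names are chosen here; the letter sequence s, p, d, f, g, h, i, k is the usual spectroscopic one and limits the example to l ≤ 7).

```python
LETTERS = "spdfghik"   # spectroscopic letters for l = 0, 1, ..., 7


def cartesian_states(n):
    """All (nx, ny, nz) with nx + ny + nz = n; nz is fixed once nx and ny are chosen."""
    return [(nx, ny, n - nx - ny) for nx in range(n + 1) for ny in range(n + 1 - nx)]


def shell_labels(n):
    """Labels n_r l of the multiplets in shell n, using n = 2 (n_r - 1) + l with n_r >= 1."""
    return [f"{(n - l) // 2 + 1}{LETTERS[l]}" for l in range(n, -1, -2)]


def multiplet_count(n):
    """Number of states counted as the sum of 2l + 1 over the allowed l."""
    return sum(2 * l + 1 for l in range(n, -1, -2))


if __name__ == "__main__":
    for n in range(5):
        print(n, "-".join(shell_labels(n)), len(cartesian_states(n)), multiplet_count(n))
```

Both counts give $\frac{1}{2}(n + 2)(n + 1)$ states per shell, not counting the spin.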


4.5.5 Hydrogen Atom 


In the following we shall investigate only the bound states of a particle with the
reduced mass m in an attractive Coulomb potential

\[
V(r) = -\frac{e^2}{4\pi\varepsilon_0\,r} ,
\]

and restrict ourselves therefore to negative energies. The scattering off a Coulomb
potential (E > 0) has to be considered separately, and we shall do this in Sect. 5.2.3.

The standard example of this potential is the hydrogen atom, but where the magnetic
moment $\mu_B$ is neglected. If we introduce the charge number Z, we also have
the theory for hydrogen-like ions (He⁺, Li⁺⁺, etc.). To some approximation, even
atoms with one outer electron can be treated. If the remaining core electrons can be
replaced by a point charge at the position of the nucleus, then the considered outer
electron is relatively far away from the core (it is said to be in a Rydberg state).
Then, according to Rydberg, a quantum defect $\delta_l$ can be introduced, and instead of
the principal quantum number n, we have the effective principal quantum number
$n^* = n - \delta_l$.

The problem is centrally-symmetric. Hence, according to p. 353, the radial Schrödinger
equation

\[
\Big[\frac{d^2}{dr^2} - \frac{l(l+1)}{r^2} + \frac{2m}{\hbar^2}\Big(E_{nl} + \frac{e^2}{4\pi\varepsilon_0\,r}\Big)\Big]\,u_{nl}(r) = 0 , \quad\text{with}\quad u_{nl}(0) = 0 ,
\]

remains to be solved. We take the Bohr radius $a_0$ and the Rydberg energy $E_R$, which,
via the fine structure constant (see p. 623)

\[
\alpha \equiv \frac{e^2}{4\pi\varepsilon_0\,\hbar c_0} = \frac{1}{137.0\ldots} ,
\]

can be derived from the length unit $\hbar/mc_0$ or the energy unit $mc_0^2$, as becomes
understandable in the context of the (relativistic) Dirac equation (Sect. 5.6.9):

\[
a_0 = \frac{1}{\alpha}\,\frac{\hbar}{mc_0} , \qquad E_R = \frac{\alpha^2}{2}\,mc_0^2 = \frac{\hbar^2}{2m\,a_0^2} .
\]

(We shall encounter the fine structure constant in Sect. 4.5.8 for the spin-orbit fine-
splitting, which is where it gets its name. For hydrogen-like ions, it is Z times greater.)
We set

\[
E = -E_R/n^2 \quad\text{and}\quad r = n\,a_0\,\rho ,
\]

where n will turn out to be the principal quantum number, and obtain the simpler
differential equation

\[
\Big(\frac{d^2}{d\rho^2} - \frac{l(l+1)}{\rho^2} - 1 + \frac{2n}{\rho}\Big)\,u_{nl}(\rho) = 0 , \quad\text{with}\quad u_{nl}(0) = 0 .
\]



We could already have used the following solution method for the one- and 
three-dimensional oscillations. It is more cumbersome, but more generally appli- 
cable than the methods mentioned so far. Hence I will introduce it here, even though 
the Coulomb problem can also be solved with operators, which are related to Lenz’s 
vector (see, e.g., [6]). 

For large ρ, the differential equation takes the form u'' − u = 0, with the two
linearly independent solutions exp(±ρ). Only the exponentially decreasing one is
normalizable. In contrast, for small ρ, according to p. 353, we have $u \approx \rho^{l+1}$. With
these boundary conditions for small and large ρ, we set

\[
u(\rho) = \rho^{l+1}\,\exp(-\rho)\,F(\rho) , \quad\text{with}\quad F(\rho) = \sum_{k=0}^{n_r} c_k\,\rho^k .
\]

For the still unknown function F, the differential equation for u implies

\[
\frac{1}{2}\,\frac{d^2F}{d\rho^2} + \frac{l + 1 - \rho}{\rho}\,\frac{dF}{d\rho} + \frac{n - l - 1}{\rho}\,F = 0 ,
\]

and hence for the expansion coefficients $c_k$, the recursion formula

\[
c_k = -\frac{2}{k}\,\frac{n - l - k}{2l + 1 + k}\;c_{k-1} .
\]

The coefficient $c_0$ is not yet fixed by the homogeneous differential equation. Its value
is determined from the normalization. But the solution is normalizable only if we
are dealing with a polynomial (with $n_r < \infty$), hence if the recursion terminates.
Otherwise we have in particular $c_k/c_{k-1} \approx 2/k$ for large k, which corresponds to the
function exp(2ρ), and despite the remaining factors, u is then not normalizable. Hence
not only must the radial quantum number $n_r$ be a natural number, but so must the
principal quantum number

\[
n = n_r + l + 1 \in \{1, 2, \ldots\} .
\]

F is thus a polynomial of order $n_r$, and the energy eigenvalues are (see Fig. 4.16)

\[
E_n = -\frac{E_R}{n^2} , \quad\text{with}\quad n \in \{1, 2, \ldots\} .
\]
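
The termination of the recursion can be followed explicitly. The Python sketch below is an illustration only (the function names and the normalization $c_0 = 1$ are choices made here); it generates the coefficients $c_k$ with exact fractions and checks the differential equation for F, multiplied by ρ, coefficient by coefficient.

```python
from fractions import Fraction


def radial_coefficients(n, l):
    """Coefficients c_0, ..., c_{n_r} of F(rho) with c_0 = 1, from
    c_k = -(2/k) (n - l - k) / (2l + 1 + k) c_{k-1}; n_r = n - l - 1."""
    coeffs = [Fraction(1)]
    for k in range(1, n - l):
        coeffs.append(-Fraction(2, k) * Fraction(n - l - k, 2 * l + 1 + k) * coeffs[-1])
    return coeffs


def residual(n, l):
    """Coefficients of rho^j in (1/2) rho F'' + (l + 1 - rho) F' + (n - l - 1) F."""
    c = radial_coefficients(n, l) + [Fraction(0)]
    return [
        (Fraction(j * (j + 1), 2) + (l + 1) * (j + 1)) * c[j + 1] + (n - l - 1 - j) * c[j]
        for j in range(len(c) - 1)
    ]


if __name__ == "__main__":
    print(radial_coefficients(2, 0))   # F for the 2s state
    print(radial_coefficients(3, 0))   # F for the 3s state
    print(residual(3, 1))
```

For non-integer n the factor n − l − k never vanishes, so the series would not terminate; this is the algebraic origin of the quantization condition.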
Except for the ground state, all states are degenerate, and not only as for general
centrally symmetric fields (where 2l+1 states have equal energy), but even more so.
A total of

\[
\sum_{l=0}^{n-1} (2l + 1) = n^2
\]

different states belong to the energy $E_n$. In atomic physics, it is usual to give the
principal quantum number n and the orbital angular momentum, using the letters
indicated in Table 4.2.

Fig. 4.16 Energy eigenvalues and radial functions of the hydrogen atom. The figure shows the
potential, the first (degenerate) eigenvalues, and the associated radial functions, for l = 0 (continuous
red lines), l = 1 (dashed blue lines), and l = 2 (continuous green lines)

Table 4.2 Multiplicity of Coulomb states. Note that all these states are to be counted twice because
of the spin

States nl        1s    2s-2p    3s-3p-3d    4s-4p-4d-4f
−E_n/E_R         1     1/4      1/9         1/16
Multiplicity     1     4        9           16

To determine the polynomials F, we use the variable $s = 2\rho = 2r/na_0$. Then the
differential equation reads

\[
\Big(s\,\frac{d^2}{ds^2} + (2l + 2 - s)\,\frac{d}{ds} + (n - l - 1)\Big)\,F = 0 ,
\]

the solution of which is the generalized Laguerre polynomial $L^{(2l+1)}_{n-l-1}(s)$ (see, e.g.,
[7]). Other functions also carry this name:

\[
L_n^{(m)}(s) = \frac{1}{n!}\;\mathrm{e}^{s}\,s^{-m}\,\frac{d^n(\mathrm{e}^{-s}\,s^{n+m})}{ds^n} = \sum_{k=0}^{n} \binom{n+m}{n-k}\,\frac{(-s)^k}{k!} ,
\]

with the resulting eigenfunctions also shown in Fig. 4.16. As for the Legendre
polynomials (p. 334) and the Hermite polynomials (p. 359), the first equation is called
Rodrigues’ formula. It fixes the polynomial by a correspondingly high derivative of
a given function. With the Leibniz formula

\[
\frac{d^n(fg)}{dx^n} = \sum_{k=0}^{n} \binom{n}{k}\,\frac{d^kf}{dx^k}\,\frac{d^{n-k}g}{dx^{n-k}} ,
\]

the second expression follows from Rodrigues’ formula. It shows that it is indeed a
polynomial of n th order. Before we prove that the differential equation is satisfied, let
us also deal with the generating function of the generalized Laguerre polynomials:

\[
\frac{1}{(1 - t)^{m+1}}\,\exp\Big(\frac{-st}{1 - t}\Big) = \sum_{n=0}^{\infty} L_n^{(m)}(s)\,t^n , \quad\text{for } |t| < 1 .
\]

It is easy to prove this. If we differentiate it with respect to s, then the left-hand side
leads to $-t\sum_{n=0}^{\infty} L_n^{(m+1)}(s)\,t^n$. Hence, comparing coefficients, we find

\[
L_n^{(m+1)} = -\frac{dL_{n+1}^{(m)}}{ds} \quad\Longrightarrow\quad L_n^{(m)}(s) = (-)^m\,\frac{d^mL_{n+m}}{ds^m} .
\]

The generalized Laguerre polynomial $L_n^{(m)}(s)$ is thus equal to the m th derivative of
the Laguerre polynomials $L_{n+m}(s) \equiv L_{n+m}^{(0)}(s)$, up to the factor $(-1)^m$. In addition, the
equation for the generating function holds for s = 0, since it is $L_n^{(m)}(0) = \binom{n+m}{n}$ and
this binomial coefficient also occurs for the Taylor series expansion of $(1 - t)^{-m-1}$
in powers of t (for |t| < 1), because for arbitrary p and natural number n, we have

\[
\binom{p}{n} \equiv \frac{p\,(p-1)\cdots(p-n+1)}{n!} = (-)^n\,\binom{n-p-1}{n} .
\]

Hence, $\binom{n+m}{n} = (-)^n\,\binom{-m-1}{n}$ for p = n+m and the generating function is correct.
If we differentiate it with respect to t and compare the coefficients, we obtain the
recursion formula


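\[
(n+1)\,L_{n+1}^{(m)}(s) = (2n + m + 1 - s)\,L_n^{(m)}(s) - (n + m)\,L_{n-1}^{(m)}(s) .
\]

Its derivative with respect to s delivers, along with the recursion formula,

\[
\begin{aligned}
L_n^{(m)}(s) &= L_n^{(m-1)}(s) + L_{n-1}^{(m)}(s) , \\
(n+1)\,L_{n+1}^{(m)}(s) &= (n + 1 - s)\,L_n^{(m)}(s) + (n + m)\,L_n^{(m-1)}(s) , \\
s\,L_n^{(m+1)}(s) &= (n + m + 1)\,L_n^{(m)}(s) - (n + 1)\,L_{n+1}^{(m)}(s) , \\
s\,L_n^{(m+1)}(s) &= (n + m)\,L_{n-1}^{(m)}(s) + (s - n)\,L_n^{(m)}(s) ,
\end{aligned}
\]

and the further recursion formula

\[
s\,L_n^{(m+1)}(s) - (m + s)\,L_n^{(m)}(s) + (n + m)\,L_n^{(m-1)}(s) = 0 ,
\]

as well as $s\,L_{n-1}^{(m+1)} + (s - m)\,L_n^{(m)} + (n + 1)\,L_{n+1}^{(m-1)} = 0$, which leads to the original
differential equation

\[
\Big(s\,\frac{d^2}{ds^2} + (m + 1 - s)\,\frac{d}{ds} + n\Big)\,L_n^{(m)}(s) = 0 .
\]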


For the normalization and the matrix elements of $R^k$, the following equation is important:

\[
\int_0^\infty ds\ \mathrm{e}^{-s}\,s^k\,L_n^{(m)}(s)\,L_{n'}^{(m')}(s) = (-)^{n+n'} \sum_l \binom{k-m}{n-l}\,\binom{k-m'}{n'-l}\,\frac{(k+l)!}{l!} .
\]

It can be derived from the generating function using $\int_0^\infty ds\ \mathrm{e}^{-s}\,s^k = k!$ and $\binom{-k-1}{l} =
(-)^l\,\binom{k+l}{l}$, which is necessary also for k < m or k < m'. In particular, the
generalized Laguerre polynomials with equal index m = m' in the range 0 ≤ s < ∞ form
an orthogonal system for the weight function $\exp(-s)\,s^m$:

\[
\int_0^\infty ds\ \exp(-s)\,s^m\,L_n^{(m)}(s)\,L_{n'}^{(m)}(s) = \frac{(m+n)!}{n!}\;\delta_{nn'} .
\]

Correspondingly, in the range −∞ < s < ∞, the Hermite polynomials form an
orthogonal system for the weight function $\exp(-s^2)$.
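
The orthogonality relation can be verified exactly for small indices, because every integral reduces to $\int_0^\infty ds\ \mathrm{e}^{-s}\,s^k = k!$. The Python sketch below is an illustration (function names are chosen here; m is restricted to non-negative integers); it builds the polynomials from the explicit sum given above.

```python
from fractions import Fraction
from math import comb, factorial


def laguerre(n, m):
    """Coefficients (ascending powers of s) of L_n^(m)(s) = sum_k C(n+m, n-k) (-s)^k / k!."""
    return [Fraction((-1) ** k * comb(n + m, n - k), factorial(k)) for k in range(n + 1)]


def overlap(n1, n2, m):
    """Integral of exp(-s) s^m L_n1^(m)(s) L_n2^(m)(s) over 0 <= s < infinity,
    evaluated term by term with the integral of exp(-s) s^k equal to k!."""
    a, b = laguerre(n1, m), laguerre(n2, m)
    return sum(a[i] * b[j] * factorial(m + i + j) for i in range(len(a)) for j in range(len(b)))


if __name__ == "__main__":
    for n1 in range(4):
        print([overlap(n1, n2, 1) for n2 in range(4)])
```

The diagonal entries reproduce (m+n)!/n!, and all off-diagonal entries vanish identically, not merely within rounding errors.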
Thus we may set

\[
u_{nl}(r) = c\;s^{l+1}\,\exp(-\tfrac{1}{2}s)\,L^{(2l+1)}_{n-l-1}(s) , \quad\text{with}\quad s = \frac{2r}{n\,a_0} ,
\]

with the still unknown normalization factor c, obtaining

\[
\langle R^k\rangle = \int d^3r\ |\psi|^2\,r^k = \int_0^\infty dr\ |u|^2\,r^k ,
\]

according to p. 353 and p. 333 (or Problem 4.35). Hence,

\[
|c|^2 = \frac{(n - l - 1)!}{n^2\,a_0\,(n + l)!} ,
\]

and for the ground state

\[
u_{10}(r) = \frac{2}{\sqrt{a_0}}\;\frac{r}{a_0}\,\exp\frac{-r}{a_0} ,
\]

and generally

\[
\langle n, l|\,R^k\,|n, l\rangle = \Big(\frac{n\,a_0}{2}\Big)^k\,\frac{(n - l - 1)!}{2n\,(n + l)!} \sum_m \binom{k+1}{m}^2\,\frac{(n + l + 1 + k - m)!}{(n - l - 1 - m)!} .
\]

Fig. 4.17 ⟨R⟩ ± ΔR depends not only on the principal quantum number n, but also on the orbital
angular momentum l. Hence the error bars for the lowest l (= 0) (red) and the highest l (= n−1)
(black) are shown, and the associated ⟨R⟩ as a dot

In particular, we have $a_0\,\langle R^{-1}\rangle = n^{-2}$, and hence,

\[
\langle V\rangle = -\frac{e^2}{4\pi\varepsilon_0}\,\langle R^{-1}\rangle = -\frac{2E_R}{n^2} = 2E_n .
\]

With $E_n = \langle T + V\rangle$, we have $\langle T\rangle = -\frac{1}{2}\,\langle V\rangle$, which also delivers the virial theorem
(see p. 79) with a Coulomb field for the time average. For the average distance
⟨R⟩, we find $\frac{1}{2}\,\{3n^2 - l(l+1)\}\,a_0$, and in particular, in the ground state, $3a_0/2$. The
most probable distance is given by the maximum of $|u(r)|^2$. The states with radial
quantum number $n_r = 0$ (and the highest angular momentum in the multiplet of equal
principal quantum number) each have only one maximum, at $n^2a_0$ (in the ground state
thus at the Bohr radius), while the probability densities $|u(r)|^2$ of the remaining states
have $n_r$ secondary maxima (see p. 367) (Fig. 4.17).
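
The sum formula for the matrix elements of $R^k$ can be evaluated exactly and compared with the closed expressions just quoted. The Python sketch below is an illustration only (the function name and the restriction to integer k ≥ −1, for which all binomial coefficients are ordinary ones, are choices made here); lengths are measured in units of $a_0$.

```python
from fractions import Fraction
from math import comb, factorial


def r_power_mean(n, l, k):
    """<n, l| R^k |n, l> in units of a0^k for integer k >= -1, from
    (n/2)^k (n-l-1)! / (2n (n+l)!) * sum_m C(k+1, m)^2 (n+l+1+k-m)! / (n-l-1-m)!."""
    if k < -1:
        raise ValueError("this sketch covers k >= -1 only")
    total = sum(
        Fraction(comb(k + 1, m) ** 2 * factorial(n + l + 1 + k - m), factorial(n - l - 1 - m))
        for m in range(min(k + 1, n - l - 1) + 1)
    )
    return Fraction(n, 2) ** k * Fraction(factorial(n - l - 1), 2 * n * factorial(n + l)) * total


if __name__ == "__main__":
    for n in range(1, 4):
        for l in range(n):
            print(n, l, r_power_mean(n, l, -1), r_power_mean(n, l, 1))
```

With k = 2 the same function also yields ⟨R²⟩ and thus the uncertainty ΔR plotted in Fig. 4.17.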

In Bohr’s atomic model, the centrifugal force cancels the Coulomb force between the electron and nucleus, i.e., $mv^2/r = e^2/(4\pi\varepsilon_0 r^2)$. Hence, $T = -\tfrac12 V$ and $E = \tfrac12 V = -E_R\, a_0/r$. Here, according to Bohr, not all distances $r$ are allowed, because the orbital angular momentum $l_z$ has to be a multiple of $\hbar$, i.e., $mvr = n\hbar$ with $n \in \{1, 2, \ldots\}$. Consequently, according to Bohr’s atomic model, we have $r = n^2 a_0$ and $E_n = -E_R/n^2$. It delivers the same energy values as the Schrödinger equation. However, in Bohr’s model, all states have an orbital angular momentum $n\hbar$ that differs from zero: s-states are not allowed, and $n$ is not the principal, but the orbital angular momentum quantum number. In addition, Bohr’s atomic model assumes a unique orbital curve, and does not incorporate the position and momentum uncertainty.




4.5.6 Time-Independent Perturbation Theory 


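If, for given $H$, we cannot solve the eigenvalue equation $(H - E_n)\,|n\rangle = 0$, thus cannot determine the eigenvalues $E_n$ and the eigenvectors $|n\rangle$, then an approximation method often helps. In particular, if $H = \tilde H + V$ and the eigenvalues and eigenvectors of $\tilde H$ are known,

$$(\tilde H - \tilde E_n)\,|\tilde n\rangle = 0 , \quad\text{with}\quad \sum_n |\tilde n\rangle\langle\tilde n| = 1 \quad\text{and}\quad \langle\tilde n|\tilde n'\rangle = \delta_{nn'} ,$$

then we can expand the unknown eigenvector $|\ldots\rangle$ of $H$ for the eigenvalue $E$ with respect to this basis and also determine the matrix elements $\langle\tilde n|H - E|\tilde n'\rangle$. Using $\langle\tilde n|H - E|\ldots\rangle = \langle\tilde n|\tilde E_n + V - E|\ldots\rangle = 0$, together with $|\ldots\rangle = \sum_{n'} |\tilde n'\rangle\langle\tilde n'|\ldots\rangle$, we obtain the system of equations

$$\begin{aligned} \langle\tilde 0|\tilde E_0 + V - E|\tilde 0\rangle\,\langle\tilde 0|\ldots\rangle + \langle\tilde 0|V|\tilde 1\rangle\,\langle\tilde 1|\ldots\rangle + \cdots &= 0 \\ \langle\tilde 1|V|\tilde 0\rangle\,\langle\tilde 0|\ldots\rangle + \langle\tilde 1|\tilde E_1 + V - E|\tilde 1\rangle\,\langle\tilde 1|\ldots\rangle + \cdots &= 0 \\ &\;\;\vdots \end{aligned}$$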


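Numerical calculations can be performed only for finite basis states $|\tilde n\rangle$, thus only approximately. If we take only two (thus a doublet), then we have already determined the eigenvalues on p. 309:

$$E_\pm = \tfrac12\, \mathrm{tr}\,H \pm \tfrac12\, \hbar\Omega ,$$

where now the average value is half of

$$\mathrm{tr}\,H = \tilde E_0 + \langle\tilde 0|V|\tilde 0\rangle + \tilde E_1 + \langle\tilde 1|V|\tilde 1\rangle ,$$

and the square of the splitting is

$$(\hbar\Omega)^2 = \bigl(\tilde E_0 + \langle\tilde 0|V|\tilde 0\rangle - \tilde E_1 - \langle\tilde 1|V|\tilde 1\rangle\bigr)^2 + 4\,|\langle\tilde 0|V|\tilde 1\rangle|^2 .$$

The two eigenvalues $E_\pm$ are always different for $\langle\tilde 0|V|\tilde 1\rangle \neq 0$. With coupling, there is no degeneracy, but the effect of level repulsion (see p. 310). Note that, without this coupling, the original eigenvalues $\tilde E_n$ change by the expectation values $\langle\tilde n|V|\tilde n\rangle$ to $\tilde E_n + \langle\tilde n|V|\tilde n\rangle$. The expansion coefficients $\langle\tilde n|\pm\rangle$ have already been determined in Sect. 4.2.10.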
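The doublet formulas can be tried out numerically. The sketch below uses hypothetical names (`doublet_levels`, `char_poly`) and arbitrary illustrative numbers; it evaluates $E_\pm = \tfrac12\,\mathrm{tr}\,H \pm \tfrac12\,\hbar\Omega$ and checks that both values are roots of the characteristic polynomial of the $2\times 2$ matrix.

```python
import math


def doublet_levels(e0, e1, v00, v11, v01):
    """Return (E_minus, E_plus) for unperturbed energies e0, e1 and coupling matrix V."""
    trace = e0 + v00 + e1 + v11
    # hbar*Omega: diagonal difference and off-diagonal coupling add in quadrature
    splitting = math.sqrt((e0 + v00 - e1 - v11) ** 2 + 4.0 * abs(v01) ** 2)
    return 0.5 * (trace - splitting), 0.5 * (trace + splitting)


def char_poly(e, e0, e1, v00, v11, v01):
    """det(H - e) for the 2x2 matrix H = diag(e0, e1) + V."""
    return (e0 + v00 - e) * (e1 + v11 - e) - abs(v01) ** 2


# Two degenerate levels are pushed apart by any non-zero coupling (level repulsion).
low, high = doublet_levels(1.0, 1.0, 0.0, 0.0, 0.3)
print(low, high)
```

In the degenerate example the splitting is exactly $2\,|\langle\tilde 0|V|\tilde 1\rangle|$, so the levels move to $1 \mp 0.3$.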

For more than two basis states, the eigenvalue problem can be solved in perturbation theory (or numerically, using the variational method explained in the next section). Here we try to solve for $(E_n - H)\,|n\rangle = 0$ with $(\tilde E_n - \tilde H)\,|\tilde n\rangle = 0$. To deal with degeneracy, we take a new basis: if the eigenvalue of $\tilde H$ is, e.g., $g$-fold, then only the $g$-dimensional problem $(H - E)\,|\ldots\rangle = 0$ has to be solved, as was just discussed for $g = 2$.

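To derive $|n\rangle$, we avoid cumbersome normalization factors if we now require $\langle\tilde n|n\rangle = 1$. The normalization can be changed again right at the end. Then we have $\langle\tilde n|V|n\rangle = \langle\tilde n|H - \tilde H|n\rangle = E_n - \tilde E_n$, or

$$E_n = \tilde E_n + \langle\tilde n|V|n\rangle .$$

The matrix element follows from $|\tilde n\rangle\langle\tilde n|V|n\rangle = |\tilde n\rangle\,(E_n - \tilde E_n) = (E_n - \tilde H)\,|\tilde n\rangle$. Here we use the mutually orthogonal projection operators

$$P = |\tilde n\rangle\langle\tilde n| \quad\text{and}\quad Q = 1 - P .$$

Hence, $P\,V\,|n\rangle = (E_n - \tilde H)\,|\tilde n\rangle$ and also $PV = V - QV = H - \tilde H - QV$, and consequently $(E_n - \tilde H - QV)\,|n\rangle = (E_n - \tilde H)\,|\tilde n\rangle$. If there is no degeneracy, the singular operator $(E_n - \tilde H)^{-1}$ can act from the left, since with the projection operator $Q$, no singular operator appears on the left. The state with $\tilde H\,|\tilde n\rangle = |\tilde n\rangle\,\tilde E_n$ is missing and hence the operator $1 - (E_n - \tilde H)^{-1} Q\, V$ is regular, while the unit operator appears on the right. Thus with the propagator

$$G(E) = \frac{Q}{E - \tilde H}$$

we find $\{1 - G(E_n)\,V\}\,|n\rangle = |\tilde n\rangle$, and hence the representation

$$|n\rangle = \{1 - G(E_n)\,V\}^{-1}\,|\tilde n\rangle ,$$

and the eigenvalue equation

$$E_n = \tilde E_n + \langle\tilde n|\,V\,\{1 - G(E_n)\,V\}^{-1}\,|\tilde n\rangle .$$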

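This is the perturbation theory of Wigner and Brillouin. Unfortunately, in this result, the unknown quantity $E_n$ also occurs on the right and is not easy to determine. But if we may expand in a geometrical series and the method converges fast enough, we may replace $G(E_n)$ by $G(\tilde E_n)$ and can immediately give $E_n$:

$$E_n \approx \tilde E_n + \langle\tilde n|V|\tilde n\rangle + \langle\tilde n|\,V\,G(\tilde E_n)\,V\,|\tilde n\rangle + \cdots = \tilde E_n + \langle\tilde n|V|\tilde n\rangle + \sum_{n'\neq n} \frac{|\langle\tilde n'|V|\tilde n\rangle|^2}{\tilde E_n - \tilde E_{n'}} + \cdots .$$

The expansion is clearly good if the absolute values of the matrix elements of $V$ are small compared with the energy-level separations $|\tilde E_n - \tilde E_{n'}|$.

By the way, $G(\tilde E_n)$ is encountered instead of $G(E_n)$ in the perturbation theory of Schrödinger and Rayleigh. With the abbreviation $\Delta_n = E_n - \tilde E_n = \langle\tilde n|V|n\rangle$, we have $G(\tilde E_n) = \{1 + G(\tilde E_n)\,\Delta_n\}\,G(E_n)$, since $A^{-1}(A - B)B^{-1} = B^{-1} - A^{-1}$ delivers

$$A^{-1} = \{1 + A^{-1}(B - A)\}\,B^{-1} .$$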


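Hence, $1 - G(\tilde E_n)\,(V - \Delta_n)$ factorizes in the form

$$\{1 + G(\tilde E_n)\,\Delta_n\}\,\{1 - G(E_n)\,V\} .$$

For the inverse of $\{1 - G(E_n)\,V\}$, we may also write

$$\{1 - G(\tilde E_n)\,(V - \Delta_n)\}^{-1}\,\{1 + G(\tilde E_n)\,\Delta_n\} ,$$

and so avoid $G(E_n)$. Since $Q\,|\tilde n\rangle$ vanishes, and hence also $G(\tilde E_n)\,\Delta_n\,|\tilde n\rangle$, we therefore have

$$|n\rangle = \{1 - G(\tilde E_n)\,(V - \Delta_n)\}^{-1}\,|\tilde n\rangle .$$

With $\{1 - G(\tilde E_n)(V - \Delta_n)\}^{-1} = 1 + \{1 - G(\tilde E_n)(V - \Delta_n)\}^{-1}\, G(\tilde E_n)\,(V - \Delta_n)$, this can be reformulated as

$$|n\rangle = \bigl(1 + \{1 - G(\tilde E_n)\,(V - \Delta_n)\}^{-1}\, G(\tilde E_n)\,V\bigr)\,|\tilde n\rangle .$$

The propagator is now taken for the known energy, although $\Delta_n$ still contains the unknown energy, so once again there is no explicit expression for it. But at least this equation is easier to solve than the one from the perturbation theory of Wigner and Brillouin. Then we obtain, to third order,

$$E_n \approx \tilde E_n + \langle\tilde n|\,V + V\,G(\tilde E_n)\,V + V\,G(\tilde E_n)\,\bigl(V - \langle\tilde n|V|\tilde n\rangle\bigr)\,G(\tilde E_n)\,V\,|\tilde n\rangle ,$$

and only encounter nonlinear equations for still higher orders. To second order, we have the same result via both methods.

For $GV \ll 1$, the quantum numbers mentioned in $|\tilde n\rangle$ are thus also approximately valid for $|n\rangle$. To next order, however, other states become mixed in. The eigenvalues of operators which commute with $\tilde H$ but not with $H$ are no longer good quantum numbers.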
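For a doublet the two perturbation schemes can be compared directly with the exact eigenvalue. This is a sketch under stated assumptions: with only two basis states, $Q$ projects onto a single state, so $\{1 - G(E)V\}^{-1}$ reduces to a scalar and the Wigner-Brillouin equation becomes $E = \tilde E_0 + V_{00} + |V_{01}|^2/(E - \tilde E_1 - V_{11})$, solved here by fixed-point iteration. The function names and the numbers are illustrative only.

```python
import math


def exact_lower(e0, e1, v00, v11, v01):
    """Lower eigenvalue of the 2x2 matrix diag(e0, e1) + V."""
    trace = e0 + v00 + e1 + v11
    split = math.sqrt((e0 + v00 - e1 - v11) ** 2 + 4.0 * abs(v01) ** 2)
    return 0.5 * (trace - split)


def wigner_brillouin(e0, e1, v00, v11, v01, steps=200):
    """Iterate E = e0 + V00 + |V01|^2 / (E - e1 - V11), starting from first order."""
    energy = e0 + v00
    for _ in range(steps):
        energy = e0 + v00 + abs(v01) ** 2 / (energy - e1 - v11)
    return energy


def rayleigh_schroedinger(e0, e1, v00, v11, v01, order=2):
    """Second- or third-order energy with the propagator at the known energy e0."""
    energy = e0 + v00 + abs(v01) ** 2 / (e0 - e1)
    if order >= 3:
        # <0| V G (V - <0|V|0>) G V |0> for a single intermediate state
        energy += abs(v01) ** 2 * (v11 - v00) / (e0 - e1) ** 2
    return energy


params = (0.0, 1.0, 0.05, -0.02, 0.1)
print(exact_lower(*params), wigner_brillouin(*params),
      rayleigh_schroedinger(*params, order=2), rayleigh_schroedinger(*params, order=3))
```

The iteration converges here because the coupling is small compared with the level separation; for strong coupling a more careful root search would be needed.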


4.5.7 Variational Method 


If the perturbation theory does not converge fast enough because no good approximation $\tilde H$ is known, then a variational method sometimes helps. It delivers first the ground state and after that also the higher states, if there is no degeneracy.

Each arbitrary approximation $|\psi\rangle$ to the ground state with the energy $E_0$ delivers an expectation value $\langle\psi|H|\psi\rangle \ge E_0$, since with the eigen representation $\{|n\rangle\}$ of $H$, we have $\langle\psi|H|\psi\rangle = \sum_n E_n\, |\langle n|\psi\rangle|^2 \ge E_0$, with $E_0 \le E_1 \le \ldots$ and $\sum_n |\langle n|\psi\rangle|^2 = 1$. Consequently, we can take any other basis $\{|\tilde n\rangle\}$, and with $|\psi\rangle = \sum_n |\tilde n\rangle\langle\tilde n|\psi\rangle$ satisfy

$$\delta\{\langle\psi|H|\psi\rangle - E\,(\langle\psi|\psi\rangle - 1)\} = 0 ,$$

where $E$ is the Lagrange parameter introduced to deal with the normalization condition. In the framework of the finite basis $\{|\tilde n\rangle\}$, it turns out to be the best approximation to the ground state energy. The expansion coefficients $\langle\tilde n|\psi\rangle$ are to be varied here. Since $H$ is Hermitian, we can trace the variational method back to $\langle\delta\psi|\,H - E\,|\psi\rangle = 0$. Note that this requirement for the matrix elements means that $(H - E)\,|\psi\rangle$ must vanish, since $\langle\delta\psi|$ is arbitrary. Naturally, the method leads more quickly to a useful result the better the basis $\{|\tilde n\rangle\}$ already describes the actual ground state with few states, but it should also be easy to determine $\langle\tilde n|H|\tilde n'\rangle$.

If, in the finite basis $\{|\tilde n\rangle\}$, we find the linear combination which minimizes $\langle\psi|H|\psi\rangle$ with the additional condition $\langle\psi|\psi\rangle = 1$, then within this framework the ground state $|\psi_0\rangle$ and its energy are determined as well as possible. The ground state found in this way may still have components orthogonal to the true one. The first excited state then follows with the same variational method and the further additional condition

$$\langle\psi|\psi_0\rangle = 0 .$$
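The inequality $\langle\psi|H|\psi\rangle \ge E_0$ can be illustrated with a one-parameter trial function instead of a finite basis. The sketch below is not the linear method of the text: it assumes a normalized Gaussian trial state $\psi \propto \exp(-\alpha r^2/a_0^2)$ for the hydrogen atom, for which the standard result in units of $E_R$ is $\langle H\rangle = 3\alpha - 4\sqrt{2\alpha/\pi}$, and it minimizes this with a simple golden-section search (all function names are hypothetical).

```python
import math


def energy_gaussian(alpha):
    """<psi|H|psi> / E_R for the Gaussian trial state with exponent alpha > 0."""
    kinetic = 3.0 * alpha                                # <T> grows with localization
    potential = -4.0 * math.sqrt(2.0 * alpha / math.pi)  # <V> of the Coulomb field
    return kinetic + potential


def golden_section_minimum(f, lo, hi, tol=1e-12):
    """Locate the minimum of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


alpha_best = golden_section_minimum(energy_gaussian, 0.01, 2.0)
e_best = energy_gaussian(alpha_best)
print(alpha_best, e_best)  # stays above the true ground state energy -1 (in E_R)
```

The minimum lies at $\alpha = 8/(9\pi)$ with $\langle H\rangle = -8E_R/(3\pi) \approx -0.85\,E_R$, above the exact value $-E_R$, as the variational principle requires.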


4.5.8 Level Splitting 


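For the coupling of a magnetic moment $\mathbf m$ of velocity $\mathbf v$ with a centrally symmetric electric field $\mathbf E = -\nabla\Phi$, the following expression was derived on p. 372:

$$V = \frac{1}{r}\,\frac{d\Phi}{dr}\,\frac{\mathbf m\cdot(\mathbf r\times\mathbf v)}{c_0^2} .$$

If we use here the potential $\Phi = e/(4\pi\varepsilon_0 r)$, Weber’s equation $c_0^{-2} = \varepsilon_0\mu_0$, the magnetic moment $\mathbf m = -e\,\mathbf S/m_0$ for the reduced mass $m_0$ (see p. 327), and

$$\mathbf r\times\mathbf v = \mathbf L/m_0 ,$$

then according to the correspondence principle (the transition to quantum mechanics with $[\mathbf L, R] = 0$ is easy) we find

$$V = \frac{\mu_0}{4\pi}\,\frac{e^2}{m_0^2}\,\frac{\mathbf L\cdot\mathbf S}{R^3} = \alpha^2 E_R\,\frac{a_0^3}{R^3}\,\frac{2\,\mathbf L\cdot\mathbf S}{\hbar^2} .$$

With the factor $\mathbf L\cdot\mathbf S$, we speak of spin-orbit coupling. (This is stronger in nuclear physics, and leads there, with a box potential, to the “magic nucleon numbers” of the shell model.) The observables $L_z$ and $S_z$ are no longer sharp, but the total angular momentum $\mathbf J = \mathbf L + \mathbf S$ is, as indeed are $J^2$ and $J_z$:

$$2\,\mathbf L\cdot\mathbf S = 2L_zS_z + L_+S_- + L_-S_+ = J^2 - L^2 - S^2 .$$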


Fig. 4.18 Fine splitting of the first excited state multiplet of the hydrogen atom. Left: Inclusion of the spin-orbit coupling. Right: The result of the Dirac theory, with splitting due to a magnetic field of increasing strength. The Landé factor is 2 for $s_{1/2}$ states, $\tfrac23$ for $p_{1/2}$ states, and $\tfrac43$ for $p_{3/2}$ states


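We thus use the coupled basis $\{|(ls)jm\rangle\}$ from Sect. 4.3.10 and find

$$\frac{2\,\mathbf L\cdot\mathbf S}{\hbar^2}\,|(l\tfrac12)jm\rangle = |(l\tfrac12)jm\rangle \times \begin{cases} l & \text{for } j = l + \tfrac12 , \\ -(l+1) & \text{for } j = l - \tfrac12 . \end{cases}$$

The degeneracies of the hydrogen levels are thus lifted (for $l > 0$) by the spin-orbit coupling: $\langle V\rangle = \alpha^2 E_R\, \langle a_0^3/R^3\rangle\, l$ for the $2l+2$ states with $j = l + \tfrac12$ and

$$\langle V\rangle = -\alpha^2 E_R\, \langle a_0^3/R^3\rangle\, (l+1)$$

for the $2l$ states with $j = l - \tfrac12$. The average value of the scalar product $\mathbf L\cdot\mathbf S$ is thus zero. This is a general sum rule.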

According to this, the first excited state of the hydrogen atom should split into three. The $2s_{1/2}$ state remains unaltered (as do all s-states), the $2p_{3/2}$ state increases by $\tfrac{1}{24}\,\alpha^2 E_R$, and the $2p_{1/2}$ state is lowered by twice that value, so the energies given in Sect. 4.5.5 are no longer valid to order $\alpha^2$. In fact, another fine splitting is found which follows only from the (relativistic) Dirac equation (Sect. 5.6.9). It leads to the result

$$E = -\frac{E_R}{n^2}\,\Bigl\{1 + \frac{\alpha^2}{n}\,\Bigl(\frac{1}{j+\frac12} - \frac{3}{4n}\Bigr) + \cdots\Bigr\}$$

and shows that the previously found degeneracy is only partially lifted. It depends on $n$ and $j$, but not on $l$ and $m$. The energy of $2p_{3/2}$ is lower than $-\tfrac14 E_R$ by $\tfrac{1}{64}\,\alpha^2 E_R$ and that of $2p_{1/2}$ and $2s_{1/2}$ even by $\tfrac{5}{64}\,\alpha^2 E_R$. According to the Dirac equation, the average value is also lowered, and the splitting amounts to $\tfrac{1}{16}\,\alpha^2 E_R$ (see Fig. 4.18).
Incidentally, according to p. 366, we find for the hydrogen atom and $n > l > 0$,

$$\Bigl\langle\frac{a_0^3}{R^3}\Bigr\rangle = \frac{1}{n^3}\,\frac{2}{l\,(l+1)\,(2l+1)} .$$

According to this, the classical spin-orbit splitting differs by a factor of 2 from the corresponding splitting due to Dirac.
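The numbers for the $n = 2$ multiplet can be reproduced with exact fractions. The following sketch (hypothetical function names, shifts in units of $\alpha^2 E_R$, angular momenta passed as $2j$ to stay with integers) evaluates the classical spin-orbit shift $\alpha^2 E_R\,\langle a_0^3/R^3\rangle \times \{l,\, -(l+1)\}$ and the order-$\alpha^2$ term of the Dirac result quoted in the text.

```python
from fractions import Fraction


def inverse_cube_radius(n, l):
    """<a0^3 / R^3> for n > l > 0."""
    return Fraction(2, n ** 3 * l * (l + 1) * (2 * l + 1))


def spin_orbit_shift(n, l, two_j):
    """Classical spin-orbit shift in units of alpha^2 E_R (zero for s-states)."""
    if l == 0:
        return Fraction(0)
    factor = l if two_j == 2 * l + 1 else -(l + 1)  # eigenvalue of 2 L.S / hbar^2
    return inverse_cube_radius(n, l) * factor


def dirac_shift(n, two_j):
    """Order-alpha^2 shift relative to -E_R/n^2, in units of alpha^2 E_R."""
    j_plus_half = Fraction(two_j + 1, 2)
    return -Fraction(1, n ** 3) * (1 / j_plus_half - Fraction(3, 4 * n))


classical = spin_orbit_shift(2, 1, 3) - spin_orbit_shift(2, 1, 1)
dirac = dirac_shift(2, 3) - dirac_shift(2, 1)
print(classical, dirac)  # the two splittings differ by a factor of 2
```

The weighted sum over the $2l+2$ and $2l$ states vanishes, which is the sum rule for $\mathbf L\cdot\mathbf S$ mentioned above.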

Even though the spin-orbit coupling in atomic physics is clearly of the same order of magnitude as other “intricacies”, it is suitable as an example application, since in nuclear physics it is the spin-orbit coupling which leads to the magic nucleon numbers, as mentioned above. In addition, these considerations support the following chain of thoughts.

The directional degeneracy is lifted by a magnetic field which we would now like to consider in perturbation theory. According to p. 327, we should use the Pauli equation for electrons. We neglect the term proportional to $A^2$, which leads to diamagnetism, a generally very small effect:

$$V = \frac{e}{2m_0}\,\mathbf B\cdot(\mathbf L + 2\,\mathbf S) .$$

If we quantize along the magnetic field, then according to perturbation theory, $L_z + 2S_z = J_z + S_z$ is important for the state $|(l\tfrac12)jm\rangle$. The first term on the right has the eigenvalue $m\hbar$, so only the expectation value of $S_z$ in the state $|(l\tfrac12)jm\rangle$ is missing. According to Sect. 4.3.10, for this purpose, it follows that

$$\langle V\rangle = m\, g\, \mu_B B , \quad\text{with the Landé factor}\quad g = \frac{2j+1}{2l+1} ,$$

because in the uncoupled basis, we have $S_z\,|l, m_l; \tfrac12, m_s\rangle = |l, m_l; \tfrac12, m_s\rangle\, m_s\hbar$ and the Clebsch-Gordan coefficients of p. 337 then deliver

$$\langle (l\tfrac12)\, l\pm\tfrac12, m|\,S_z\,|(l\tfrac12)\, l\pm\tfrac12, m\rangle = \pm\frac{m\hbar}{2l+1} .$$

This result of perturbation theory is true only for small external magnetic fields, such that higher-order terms can be neglected.
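The Landé factor can be recomputed from the squared Clebsch-Gordan coefficients for coupling $l$ with spin $\tfrac12$. The sketch below assumes the standard values: the spin-up weight in $|(l\tfrac12)jm\rangle$ is $(l + \tfrac12 \pm m)/(2l+1)$ for $j = l \pm \tfrac12$. Function names are hypothetical, and $j$ and $m$ are passed as $2j$ and $2m$.

```python
from fractions import Fraction


def sz_expectation(l, two_j, two_m):
    """<S_z>/hbar in the coupled state |(l 1/2) j m>."""
    half = Fraction(1, 2)
    m = Fraction(two_m, 2)
    if two_j == 2 * l + 1:                    # j = l + 1/2
        p_up = (l + half + m) / (2 * l + 1)
    elif two_j == 2 * l - 1 and l > 0:        # j = l - 1/2
        p_up = (l + half - m) / (2 * l + 1)
    else:
        raise ValueError("j must be l + 1/2 or l - 1/2")
    # spin up contributes +1/2, spin down -1/2
    return half * p_up - half * (1 - p_up)


def lande_factor(l, two_j):
    """g from <J_z + S_z> = m g hbar, evaluated for the stretched state m = j."""
    m = Fraction(two_j, 2)
    return (m + sz_expectation(l, two_j, two_j)) / m


print(lande_factor(0, 1), lande_factor(1, 1), lande_factor(1, 3))
```

The three printed values correspond to the $s_{1/2}$, $p_{1/2}$ and $p_{3/2}$ states and agree with $g = (2j+1)/(2l+1)$.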


4.5.9 Summary: Time-Independent Schrödinger Equation 


This Schrödinger equation is a second order differential equation for the unknown 
wave function. For bound states, the solution must vanish at infinity in order to 
be normalizable, and only then can it deliver a probability amplitude. On account 
of these boundary conditions, the time-independent Schrödinger equation becomes
an eigenvalue equation for the energy. For unbound states, there is no eigenvalue 
condition: the energy can change continuously, and the improper Hilbert vectors 
serve only as an expansion basis for wave packets. 

Since, according to the uncertainty relation, position and momentum, and hence also potential and kinetic energy, cannot be sharp simultaneously, there is a tunnel effect in quantum mechanics: for a given energy, there is a finite probability of finding a particle in classically forbidden regions.

Particularly important examples of the application of the time-independent Schrödinger equation are harmonic oscillations (their energy spectrum is equally spaced above the zero-point energy) and the hydrogen atom, or more precisely, the Kepler problem $V(r) \propto r^{-1}$ (with countable energy eigenvalues $E_n = -E_R/n^2$ for bound states and continuous eigenvalues $E > 0$ for scattering states). Free motion and piecewise constant potentials are even simpler to treat.


4.6 Dissipation and Quantum Theory 


This section goes beyond the usual scope of a course entitled Quantum Mechanics 
I and, apart from Sect. 4.6.4 on Fermi’s golden rule, can be skipped or studied only 
after the Chaps. 5 and 6. 


4.6.1 Perturbation Theory 


The Dirac picture is applied in particular to the coupling of atomic structures to their 
macroscopic surroundings. Without this influence we would not be able to observe 
atomic objects at all, since all detectors and measuring instruments belong to the 
macroscopic environment. (Hence this section is indispensable for the theory of the 
measurement process, although we shall not pursue this any further here.) We observe 
only a few degrees of freedom, but we have to consider their coupling to the many 
degrees of freedom of the environment. The difference between these two numbers 
is essential for the following. Hence we shall use the abbreviations “m” and “f” (for 
many and few) to indicate the two parts. Of course, it would be impossible to follow 
the many “inner degrees of freedom” of a solid separately. They have to be treated 
like those of the environment. At any given time, we observe only a few degrees of 
freedom of the system. 

Let us consider, e.g., an excited atom, which emits light. In the simplest case we 
may consider the atom as a two-level system and the environment as the surrounding 
electromagnetic field. Even if it was initially particularly simple (without photons), 
the light quantum (photon) can still be emitted from many different states, these 
being distinguished, e.g., by the propagation direction, but also by the time of arrival 
at the detector. 

For these considerations, pure states alone are not enough. In particular, averaging 
effects will enhance the degree of “impurity”, so we describe everything with den- 
sity operators. For their time dependence, in the interaction picture (Dirac picture), 
according to p. 346, we have 


$$i\hbar\,\frac{d\rho_D}{dt} = [V_D, \rho_D] ,$$

where the operators $\rho$ and $V$ act on both parts. But only the few degrees of freedom of the open system are of interest, and hence also only the equation of motion for a simpler reduced density operator will concern us, viz.,

$$\rho_f = \mathrm{tr}_m\, \rho_D ,$$

since we consider only measurable quantities $O_f$ which do not depend on the many degrees of freedom and hence are unit operators with respect to these degrees of freedom:

$$\langle O_f\rangle = \mathrm{tr}_{mf}\, \rho_D\, O_f = \mathrm{tr}_f\, \rho_f\, O_f .$$

In particular, we shall derive an equation of motion for $\rho_f$ from the expression for $\dot\rho_D$. The result will not be a von Neumann equation: open systems differ in principle from closed systems.

Concerning the experimental conditions, we require that initially the “object” and “environment” should be independent of each other (“uncorrelated”), so that initially $\rho_D$ factorizes into $\rho_f$ and $\rho_m = \mathrm{tr}_f\, \rho_D$ (more on the notion of correlation in Sect. 6.1.5). This initial condition is suggestive, because for each repetition of the experiment, we produce the object as identically as possible, but the environment has far too many possibilities of adjustment. Often we simply require that the coupling necessary for the correlation should be turned on only at the beginning of the experiment. For the discussion below, both requirements deliver the same result.

Using the product form, the number of independent density matrix parameters is much reduced. If we pay attention only to $\rho = \rho^\dagger$, but not to $\mathrm{tr}\,\rho = 1$, then an $N\times N$ matrix requires $N^2$ real parameters, but for the product form, instead of the $(N_m N_f)^2$ parameters, only $N_m^2 + N_f^2$ are needed. Generally, for uncorrelated systems, we have $\mathrm{tr}\,\rho^2 = \mathrm{tr}\,\rho_m^2 \cdot \mathrm{tr}\,\rho_f^2$, otherwise this is not true: correlated systems form entangled states. For example, the singlet state of two electrons in the spin space has $\mathrm{tr}\,\rho^2 = 1$, but the reduced density operator of each single electron has $\mathrm{tr}\,\rho^2 = \tfrac12$.
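The singlet example can be verified with a few lines of plain Python. In this sketch the basis ordering (up-up, up-down, down-up, down-down) and the helper names `purity` and `partial_trace_second` are choices made here for illustration; matrices are nested lists and all entries are real.

```python
def purity(rho):
    """tr(rho^2) for a real symmetric or Hermitian matrix given as nested lists."""
    size = len(rho)
    return sum((rho[i][j] * rho[j][i]).real for i in range(size) for j in range(size))


def partial_trace_second(rho):
    """Trace out the second spin; basis index of the pair (a, b) is 2*a + b."""
    return [[sum(rho[2 * a + b][2 * c + b] for b in range(2)) for c in range(2)]
            for a in range(2)]


# Singlet (|up,down> - |down,up>)/sqrt(2) as a pure-state density matrix
amplitudes = [0.0, 2 ** -0.5, -(2 ** -0.5), 0.0]
rho_pair = [[x * y for y in amplitudes] for x in amplitudes]
rho_single = partial_trace_second(rho_pair)

print(purity(rho_pair), purity(rho_single))
```

The pair is in a pure state, while each electron on its own is maximally mixed, so the product rule for uncorrelated systems fails here.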

If the parts are not coupled, then for all times, $\rho_D$ could be split into the product $\rho_m \otimes \rho_f$. But the interaction leads to a correlation. Hence we write

$$\rho_D = \rho_m \otimes \rho_f + \rho_K , \quad\text{with}\quad \rho_K(0) = 0 \quad (\text{and } \mathrm{tr}\,\rho_K(t) = 0, \text{ not } 1) .$$

Then we obtain $i\hbar\,\dot\rho_f = \mathrm{tr}_m[V_D, \rho_m \otimes \rho_f + \rho_K]$ and a corresponding expression for $i\hbar\,\dot\rho_m$. The term $\mathrm{tr}_m[V_D, \rho_m \otimes \rho_f]$ is equal to the commutator of $\mathrm{tr}_m V_D\, \rho_m \otimes 1_f$ with $\rho_f$, where $\mathrm{tr}_m V_D\, \rho_m \otimes 1_f$ describes the average interaction of the environment with the experimental object. It can be taken as a part of the free Hamilton operator $H_f$, and correspondingly $\mathrm{tr}_f V_D\, 1_m \otimes \rho_f$ for $H_m$. Then these terms for the interaction vanish, and we find

$$i\hbar\,\frac{d\rho_f}{dt} = \mathrm{tr}_m[V_D, \rho_K] \quad\text{and}\quad i\hbar\,\frac{d\rho_m}{dt} = \mathrm{tr}_f[V_D, \rho_K] .$$

Since $\rho_K$ is of at least first order in the interaction, the changes in $\rho_f$ and $\rho_m$ with time in the Dirac picture are at least of second order in $V_D$, and this can be exploited in perturbation theory.

The correlation $\rho_K$ changes by one order less:

$$i\hbar\,\frac{d\rho_K}{dt} = [V_D, \rho_m \otimes \rho_f] .$$

Here on the right, the expression $[V_D, \rho_K] - \mathrm{tr}_f[V_D, \rho_K] \otimes \rho_f - \rho_m \otimes \mathrm{tr}_m[V_D, \rho_K]$ is left out, because it depends on a higher order of the coupling. Hence, with regard to the initial value,

$$\rho_K(t) = \frac{1}{i\hbar}\int_0^t dt'\, [V_D(t'), \rho_m(t') \otimes \rho_f(t')] .$$

The final result is thus a coupled system of integro-differential equations: $\rho_K$ follows from an integral of $\rho_m$ and $\rho_f$ and these quantities from differential equations which depend on $\rho_K$. In particular, for the unknown $\rho_f$, we now have the equation

$$\begin{aligned} \frac{d\rho_f}{dt} &= -\frac{1}{\hbar^2}\, \mathrm{tr}_m\Bigl[V_D(t), \int_0^t dt'\, [V_D(t'), \rho_m(t') \otimes \rho_f(t')]\Bigr] \\ &= -\frac{1}{\hbar^2}\int_0^t dt'\, \mathrm{tr}_m[V_D(t), V_D(t')\, \rho_m(t') \otimes \rho_f(t')] + \text{h.c.} \end{aligned}$$

Here use was made of the fact that the operators are Hermitian. The double commutator can then be reformulated into two simple commutators.

In order to further simplify the equation, we decompose the coupling $V_D$ into factors which each act only on one of the two parts, although there are several such products, and only their sum delivers the full coupling:

$$V_D = \sum_k C_m^k \otimes V_f^k .$$

Then, e.g., for a two-level system, a $V_f^k$ may occur (even though $V_D$ is Hermitian, the factors on the right-hand side may not be; there are further terms which ensure $V_D = V_D^\dagger$) and both are interconnected with appropriate factors $C_m^k$. However, this does not mean that each has only one creation and annihilation operator. In fact, each factor $C_m^k$ embraces a huge set of basis operators (modes) for the environment. But since we are interested only in a few degrees of freedom and, when we form the trace, we average over many degrees of freedom, the notation is rather useful. Here for the time being we shall not fix the normalization of the basis operators $C_m^k$, so the $V_f^k$ will remain undetermined.


Hence, the integrand splits up into factors for the individual parts:

$$\mathrm{tr}_m[V_D(t), V_D(t')\, \rho_m(t') \otimes \rho_f(t')] = \sum_{kk'} \mathrm{tr}_m\{C_m^k(t)\, C_m^{k'}(t')\, \rho_m(t')\}\; [V_f^k(t), V_f^{k'}(t')\, \rho_f(t')] .$$

Here the influence of the part with many degrees of freedom is contained in the factors

$$g^{kk'}(t, t') = \mathrm{tr}_m\{C_m^k(t)\, C_m^{k'}(t')\, \rho_m(t')\} .$$

If they are determined, then a decoupled integro-differential equation remains for the unknown density operator $\rho_f$:

$$\frac{d\rho_f}{dt} = \frac{1}{\hbar^2}\sum_{kk'}\Bigl[\int_0^t g^{kk'}(t, t')\, V_f^{k'}(t')\, \rho_f(t')\, dt' ,\; V_f^k(t)\Bigr] + \text{h.c.}$$


4.6.2 Coupling to the Environment 


So far we have respected the two parts as equivalent terms in a weak coupling and have not yet made use of the fact that they differ essentially by the number of degrees of freedom. This difference allows us to estimate the weight functions $g^{kk'}$ and to simplify the integro-differential equation.

As discussed in Sect. 4.4.4, $\rho(t) = \rho(0) + (i\hbar)^{-1}\int_0^t [V(t'), \rho(t')]\, dt'$ solves the initial equation $i\hbar\,\dot\rho = [V, \rho]$, but since the unknown $\rho(t)$ appears on the right, the solution is not found yet. In a perturbation theory, we replace $\rho(t')$ in the integrand by the initial value $\rho(0)$ and then obtain at least an approximate solution. In the given case, we do not need this approximation for $\rho_f(t')$; only for $\rho_m(t')$ will it be necessary. In particular, it will turn out that $g^{kk'}(t, t')$ puts the main weight on $t' \approx t$ and hence the main weight of $\rho_f$ is only for the time $t$.

Here we start from the fact that the environment is initially in equilibrium. Otherwise we would also like to obtain the response of the considered object to new environmental conditions, which is in fact also an important question, but will only be investigated afterwards.

Without coupling of the two parts, the environment would remain in its initial state. We now assume that there is no feedback: so the object perturbs its environment (otherwise we could not investigate it at all), but not so strongly that it would be noticed, otherwise we would have to fix the boundary between the two differently. Hence,

$$g^{kk'}(t, t') \approx \mathrm{tr}_m\{C_m^k(t)\, C_m^{k'}(t')\, \rho_m(0)\} .$$




The “recurrence time” expected for a given closed system depends on the feedback. 
But with environmental conditions, we shall introduce a damping of the open system 
which prohibits this feedback. 

With $C_m(t) = U_m^\dagger(t)\, C_m\, U_m(t)$ and $U_m(t)\, U_m^\dagger(t') = U_m(t - t')$, and because $\rho_m(0)$ is stationary and hence commutes with $U_m(t')$, it follows that

$$g^{kk'}(t, t') = \mathrm{tr}_m\, C_m^k(t - t')\, C_m^{k'}(0)\, \rho_m(0) = g^{kk'}(t - t') .$$

Thus only the time difference is important for $g^{kk'}$, and the energy representation is therefore particularly useful:

$$g^{kk'}(t - t') = \sum_{nn'} \langle n_m|C_m^k|n'_m\rangle\, \langle n'_m|C_m^{k'}|n_m\rangle\, \langle n_m|\rho_m|n_m\rangle\, \exp\frac{i\,(E_{n_m} - E_{n'_m})\,(t - t')}{\hbar} .$$

Here the many degrees of freedom are reduced to a nearly continuous eigenvalue spectrum of the environmental energy $E$ with the state density $g_m(E)$. We replace the double sum by a double integral,

$$g^{kk'}(t) = \iint dE\, dE'\, g_m(E)\, g_m(E')\, \langle E'|C_m^k|E\rangle\, \langle E|C_m^{k'}|E'\rangle\, \rho_m(E')\, \exp\frac{i\,(E' - E)\,t}{\hbar} ,$$

and now make the ansatz $\rho_m(E') = g_m^{-1}(E_0)\, \delta(E' - E_0)$. The factor $g_m^{-1}(E_0)$ follows from the normalization condition $\int dE'\, g_m(E')\, \rho_m(E') = 1$. (Actually, we should start from a thermal distribution with a temperature $T$, but this is not important here.) Hence, we obtain

$$g^{kk'}(t) = \int dE\, g_m(E)\, \langle E_0|C_m^k|E\rangle\, \langle E|C_m^{k'}|E_0\rangle\, \exp\frac{i\,(E_0 - E)\,t}{\hbar} .$$

When forming the trace, we clearly require that an annihilation operator $C_m^k$ always be followed by its adjoint creation operator. Hence the product in front of the exponential function is real (and non-negative). In the last equation of Sect. 4.6.1, in the Hermitian conjugate expression, where $g^{kk'}(t, t')$ is actually to be replaced by $g^{k'k}(t', t)$, we may now also have $g^{kk'*}(t - t')$. If we rephrase $k \leftrightarrow k'$ there, then we arrive at

$$\frac{d\rho_f}{dt} = \frac{1}{\hbar^2}\sum_{kk'}\Bigl\{\Bigl[\int_0^t g^{kk'}(t - t')\, V_f^{k'}(t')\, \rho_f(t')\, dt' ,\; V_f^k(t)\Bigr] + \Bigl[V_f^{k'}(t) ,\; \int_0^t g^{kk'*}(t - t')\, \rho_f(t')\, V_f^k(t')\, dt'\Bigr]\Bigr\} .$$


This integro-differential equation can still be simplified quite decisively using the 
Markov approximation. 




4.6.3 Markov Approximation 


Since the environment has many different eigenfrequencies, $g^{kk'}$ changes fast in comparison to $\rho_f$. The “memory” of the environment is much shorter than that of the atomic object. We therefore expect $g^{kk'}$ to decrease rather fast towards zero with increasing $|t - t'|$. Hence we take $\rho_f(t')$ in the integrand for $t' = t$ (Markov approximation) and may then extract it from the integral, whereupon the integro-differential equation becomes a simpler differential equation. The change in $\rho_f$ at time $t$ then depends only on the simultaneous value of $\rho_f$ and no longer on the earlier values. Hence we introduce two dimensionless auxiliary quantities and if $g^{kk'}(t'')$ tends sufficiently fast towards zero, we may also integrate to infinity:

$$A^{kk'} = \frac{1}{\hbar}\int_0^\infty g^{kk'}(t'')\, U_f(t'')\, V_f^{k'}\, U_f^\dagger(t'')\, dt'' ,$$

$$\bar A^{kk'} = \frac{1}{\hbar}\int_0^\infty g^{kk'*}(t'')\, U_f(t'')\, V_f^{k}\, U_f^\dagger(t'')\, dt'' .$$

With $A^{kk'}(t) = U_f^\dagger(t)\, A^{kk'}\, U_f(t)$, we then obtain the differential equation

$$\frac{d\rho_f}{dt} = \frac{1}{\hbar}\sum_{kk'}\bigl\{[A^{kk'}(t)\, \rho_f(t), V_f^k(t)] + [V_f^{k'}(t), \rho_f(t)\, \bar A^{kk'}(t)]\bigr\} ,$$

where the operators $A^{kk'}$ and $\bar A^{kk'}$ still have to be investigated in more detail.

Hence, we assume that $V_f^{k'}$ changes the energy of the state by $\delta E_f^{k'}$, and likewise for $A^{kk'}$, while $\bar A^{kk'}$ changes it by $\delta E_f^{k}$. If we average now over the fast processes and pay attention only to the contributions of the slower parts, we have $\delta E_f^{k'} = -\delta E_f^{k}$. For the excitation of an atom by a transverse electromagnetic wave, this procedure is called the rotating-wave approximation, because these terms seem to be slowly variable to an observer rotating along with the wave. In each of the two commutators, there is a creation and an annihilation operator $V_f$. Hence, we have $A^{kk'} = \pi\, a^{kk'}\, V_f^{k'}$ and $\bar A^{kk'} = \pi\, a^{kk'*}\, V_f^{k}$, using the common abbreviation

$$a^{kk'} = \int dE\, g_m(E)\, \langle E_0|C_m^k|E\rangle\, \langle E|C_m^{k'}|E_0\rangle\, \frac{1}{\pi\hbar}\int_0^\infty dt''\, \exp\frac{i\,(E_0 - E + \delta E_f^{k'})\, t''}{\hbar} .$$

The differential equation under consideration then simplifies to

$$\frac{d\rho_f}{dt} = \frac{\pi}{\hbar}\sum_{kk'} \mathrm{Re}\, a^{kk'}\, \bigl\{[V_f^{k'}(t)\, \rho_f(t), V_f^k(t)] + [V_f^{k'}(t), \rho_f(t)\, V_f^k(t)]\bigr\} ,$$

since $\mathrm{Im}\, a^{kk'}$ is multiplied by $[V_f^k(t)\, V_f^{k'}(t), \rho_f]$. This commutator is not important in the present discussion, because we shall not occupy ourselves with the determination of $H_f$ here; we would only obtain an amendment to the Hamilton operator, e.g., for the electromagnetic coupling of an atom to the surrounding vacuum, the famous Lamb shift. According to Sect. 1.1.10, the real part of the integral over $t''$ is equal to $\pi\hbar\, \delta(E_0 - E + \delta E_f^{k'})$:

$$\mathrm{Re}\, a^{kk'} = g_m(E_0 + \delta E_f^{k'})\, \langle E_0|C_m^k|E_0 + \delta E_f^{k'}\rangle\, \langle E_0 + \delta E_f^{k'}|C_m^{k'}|E_0\rangle .$$

Here, for appropriate normalization of the operators $V_f^k$, we may take the factors $C_m^k$ and $C_m^{k'}$ as Bose operators $\Psi$ and $\Psi^\dagger$. This is true for $\delta E_f^{k'} > 0$; for $\delta E_f^{k'} < 0$, conversely $C_m^k$ is to be replaced by $\Psi^\dagger$ and $C_m^{k'}$ by $\Psi$. In the following we shall write $\pm\delta E$ instead of $\delta E_f^{k'}$ and assume $\delta E > 0$.

If no degeneracy occurs, then $k$ and $k'$ are uniquely related to each other, and instead of the double sum, a single sum suffices. Note that, for an isotropic environment, we have in fact the usual directional degeneracy, but $\mathrm{tr}_m\, C_m^k(t - t')\, C_m^{k'}(0)\, \rho_m(0)$ then also contributes only as a scalar, and this again relates $k$ and $k'$ to each other uniquely. For $g_m(E_0 + \delta E)$ stands the factor $\Psi_k\Psi_k^\dagger$, for $g_m(E_0 - \delta E)$ the factor $\Psi_k^\dagger\Psi_k = n_k$. With $[\Psi, \Psi^\dagger] = 1$, the factor for $g_m(E_0 + \delta E)$ is therefore greater by one than that for $g_m(E_0 - \delta E)$. In Sect. 6.5.7, it will be shown that, for thermal radiation, we have $\bar n_k = \{\exp(\hbar\omega_k/k_BT) - 1\}^{-1}$, where the factor $k_B$ in front of the temperature $T$ is the Boltzmann constant, and the normalization volume $V$ has the state density $g_m(E) = VE^2/(\pi^2\hbar^3c_0^3)$. For the coupling to the vacuum (for spontaneous emission), we naturally work with $\bar n_k = 0$ (or $T = 0$) so that only the term with $g_m(E_0 + \delta E)$ appears, and not the term with $g_m(E_0 - \delta E)$. Then there is only forced absorption, described by $H_f$, but both forced and spontaneous emission. (For $T > 0$, there is also spontaneous absorption.) Spontaneous processes are not described by $H_f$, but by the dissipation discussed here. Taking all this together, if $H_f$ is not degenerate, we then obtain

$$\frac{d\rho}{dt} = \frac{\pi}{\hbar}\sum_k \bigl\{g_+^k\, [V_-^k(t)\, \rho(t), V_+^k(t)] + g_-^k\, [V_+^k(t)\, \rho(t), V_-^k(t)]\bigr\} + \text{h.c.}$$

Here we have left out the subscript f, because all operators now refer to the few relevant degrees of freedom anyway. In addition, with $\delta E$ appearing implicitly in $k$, we have

$$g_-^k = \bar n_k\, g_m(E_0 - \delta E) \quad\text{and}\quad g_+^k = (\bar n_k + 1)\, g_m(E_0 + \delta E) .$$

Note that, if there is no spontaneous absorption, then $\bar n_k = 0$ and hence also $g_-^k = 0$. The Hermitian conjugate of $g_\pm[V_\mp\rho, V_\pm]$ is equal to $g_\pm[V_\mp, \rho\, V_\pm]$. With

$$g_\pm[V_\mp\rho, V_\pm] + \text{h.c.} = g_\pm\, \bigl(2V_\mp\rho\, V_\pm - \{V_\pm V_\mp, \rho\}\bigr) ,$$

the equation of motion is often reformulated accordingly.
If we return to the Schrödinger picture (without including the subscript S), then with time-independent operators $H_f$ and $V_\pm^k$, it follows that

$$\frac{d\rho}{dt} = \frac{[H_f, \rho]}{i\hbar} + \frac{\pi}{\hbar}\sum_k \bigl\{g_+^k\, [V_-^k\rho(t), V_+^k] + g_-^k\, [V_+^k\rho(t), V_-^k] + \text{h.c.}\bigr\} .$$

We shall apply this Liouville equation to different examples. It conserves the trace of $\rho$, because $d\rho/dt$ can be expressed purely in terms of commutators, but not the purity of the state, since we generally have

$$\frac{d\, \mathrm{tr}\,\rho^2}{dt} = \frac{4\pi}{\hbar}\sum_k \bigl\{g_+^k\, \mathrm{tr}([V_-^k\rho, V_+^k]\, \rho) + g_-^k\, \mathrm{tr}([V_+^k\rho, V_-^k]\, \rho)\bigr\} ,$$

which differs from zero. Hence for $g_\pm^k \neq 0$ there is also no unitary operator $U(t)$ with the property $\rho(t) = U(t)\, \rho(0)\, U^{-1}(t)$. Incidentally, for real $g_\pm$ and $V_\pm^\dagger = V_\mp$, $\rho \ge 0$ is also conserved, as was proven by Lindblad in 1976.

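There may still be amendments $g_0[V_0\rho, V_0^\dagger]$ without energy exchange. These destroy the phases of the density operator. For example, for a doublet, we thus have $\rho(t) = \tfrac12\,(1 + \boldsymbol\sigma\cdot\langle\boldsymbol\sigma(t)\rangle)$ and $H = \tfrac12\,\hbar\Omega\,\sigma_z$, whence

$$\frac{d\rho}{dt} = \frac{\Omega}{2i}\,[\sigma_z, \rho] + \Bigl(\gamma_+[\sigma_-\rho, \sigma_+] + \gamma_-[\sigma_+\rho, \sigma_-] + \gamma_0\,\frac{[\sigma_z\rho, \sigma_z]}{4} + \text{h.c.}\Bigr) .$$

Here $\gamma_0$ captures the coherence-destroying processes without energy exchange with the environment, $\gamma_+$ those with energy delivery to it, and $\gamma_-$ those with energy intake from it, which is only possible for $T > 0$. Hence with

$$\langle\sigma_z\rangle_\infty = -\frac{\gamma_+ - \gamma_-}{\gamma_+ + \gamma_-} ,$$

we find

$$\begin{aligned} \langle\sigma_x + i\sigma_y\rangle_t &= \langle\sigma_x + i\sigma_y\rangle_0\, \exp(i\Omega t)\, \exp\{-(\gamma_+ + \gamma_- + \gamma_0)\, t\} , \\ \langle\sigma_z\rangle_t &= \langle\sigma_z\rangle_\infty + \{\langle\sigma_z\rangle_0 - \langle\sigma_z\rangle_\infty\}\, \exp\{-2(\gamma_+ + \gamma_-)\, t\} . \end{aligned}$$

For $\gamma_+/\gamma_- \approx (\bar n + 1)/\bar n$, we have $\langle\sigma_z\rangle_\infty = -(2\bar n + 1)^{-1}$, and for thermal radiation

$$\bar n = \{\exp(\hbar\Omega/k_BT) - 1\}^{-1} ,$$

whence the Bloch vector for $k_BT \ll \hbar\Omega$ tends towards $-\mathbf e_z$ and for $k_BT \gg \hbar\Omega$ to zero (see Fig. 4.19).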
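The relaxation laws for the doublet can be checked by integrating the equation of motion numerically. The sketch below makes several assumptions that should be read as such: $\sigma_\pm = \tfrac12(\sigma_x \pm i\sigma_y)$, $H = \tfrac12\hbar\Omega\sigma_z$, a dephasing term $\gamma_0[\sigma_z\rho, \sigma_z]/4 + \text{h.c.}$, and therefore a coherence phase $\exp(+i\Omega t)$. The helper names and the rate values are illustrative, and the integrator is a plain fourth-order Runge-Kutta scheme on $2\times 2$ nested lists.

```python
import cmath
import math

SZ = [[1, 0], [0, -1]]  # sigma_z
SP = [[0, 1], [0, 0]]   # sigma_+ (raises from the lower to the upper state)
SM = [[0, 0], [1, 0]]   # sigma_- (lowers from the upper to the lower state)


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def combine(*terms):
    """Linear combination sum(coeff * matrix) of 2x2 matrices."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)] for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def commutator(a, b):
    return combine((1, mat_mul(a, b)), (-1, mat_mul(b, a)))


def rhs(rho, omega, g_plus, g_minus, g_zero):
    """Right-hand side of the doublet equation of motion (assumed form, see lead-in)."""
    out = combine((-0.5j * omega, commutator(SZ, rho)))
    for rate, first, second in ((g_plus, SM, SP), (g_minus, SP, SM), (g_zero / 4, SZ, SZ)):
        term = commutator(mat_mul(first, rho), second)
        out = combine((1, out), (rate, term), (rate, dagger(term)))  # term + h.c.
    return out


def evolve(rho, t_end, steps, **par):
    """Fourth-order Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(rho, **par)
        k2 = rhs(combine((1, rho), (h / 2, k1)), **par)
        k3 = rhs(combine((1, rho), (h / 2, k2)), **par)
        k4 = rhs(combine((1, rho), (h, k3)), **par)
        rho = combine((1, rho), (h / 6, k1), (h / 3, k2), (h / 3, k3), (h / 6, k4))
    return rho


par = dict(omega=2.0, g_plus=0.3, g_minus=0.1, g_zero=0.2)
t = 1.5
rho_t = evolve([[0.5, 0.5], [0.5, 0.5]], t, 3000, **par)  # start along +x

sz_num = (rho_t[0][0] - rho_t[1][1]).real   # <sigma_z>
coh_num = 2 * rho_t[1][0]                   # <sigma_x + i sigma_y>
sz_inf = -(par["g_plus"] - par["g_minus"]) / (par["g_plus"] + par["g_minus"])
sz_formula = sz_inf + (0.0 - sz_inf) * math.exp(-2 * (par["g_plus"] + par["g_minus"]) * t)
coh_formula = cmath.exp(1j * par["omega"] * t) * math.exp(
    -(par["g_plus"] + par["g_minus"] + par["g_zero"]) * t)
print(sz_num, sz_formula, coh_num, coh_formula)
```

With these conventions the numerical values agree with the two closed expressions, and the trace of $\rho$ stays equal to one throughout the integration.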

Incidentally, we often see the claim that the dissipation might be describable by a non-Hermitian Hamilton operator $H = R - iI$, with $R = R^\dagger$ and $I = I^\dagger$. Then, $H_{nn'}^* = R_{n'n} + iI_{n'n}$, and from $i\hbar\,\dot\psi_n = \sum_{n'} H_{nn'}\,\psi_{n'}$, for $\rho_{nn'} = \psi_n\psi_{n'}^*$, the equation $\dot\rho = -(i[R, \rho] + \{I, \rho\})/\hbar$ would follow. Here, in contrast to the previously derived equation of motion, the trace of $\rho$ would not be conserved. Thus the ansatz $H = R - iI$ could not be valid generally, but at most for special states, e.g., in scattering theory, we consider “decaying states” (see Sect. 5.2.5), the probabilities of which decrease in the course of time.

Fig. 4.19 Spiral orbit of the Bloch vector with damping by an environment at temperature $T = 0$. Without damping, according to p. 343, it proceeds on a circle with axis $\mathrm{tr}(\boldsymbol\sigma H)$. The damping leads to a spiral orbit. Here $\gamma_+ + \gamma_- = \gamma_0$ is assumed, so the orbit lies on a cone, unless it already starts on the axis. Larger $\gamma_0$ narrows the orbit towards the axis and perturbs the coherence even faster. For $T > 0$, the attractor (open circle) lies higher, for $k_BT \gg \hbar\Omega$ in the center

For degeneracy of $H_f$, the situations are not quite as simple, since the index $k$ actually belongs to $V_f^k$, while $C_m^k$ embraces many modes, and now for $k \neq k'$ in $C_m^k$ and $C_m^{k'}$ the same modes may occur, so we may no longer separate the factor $\delta^{kk'}$ from $g^{kk'}$. Of these, only the mutually degenerate states are captured: instead of the term with $k$, many terms now occur, corresponding to the degree of degeneracy. We shall discuss this problem in more detail in Sect. 4.6.5.


4.6.4 Deriving the Rate Equation and Fermi’s Golden Rule 


The Schrödinger and von Neumann equations lead to the time-development operator
$U(t) = \exp(-iHt/\hbar)$, and hence immediately after the beginning of the experiment
to $U \approx 1 - iHt/\hbar$. If initially only the energy eigenstate $|n_0\rangle$ is occupied, then the
occupation probabilities immediately after the beginning of the experiment do not
change linearly, but quadratically with the time, i.e., $\langle n|\rho|n\rangle \approx |\langle n|H|n_0\rangle\, t/\hbar|^2$ for
$n \neq n_0$. Actually, the occupation probability is expected initially (for small t) to
increase linearly. The quadratic dependence is so surprising that it is even referred
to as the quantum Zeno paradox.
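The quadratic onset can be checked on the simplest example. The following Python sketch is an added illustration under assumptions that are not part of the text: a two-level model with Hamiltonian matrix [[0, v], [v, delta]], units with ħ = 1, and arbitrarily chosen values of v and delta. It compares the exact occupation probability of the second state (Rabi formula) with the short-time estimate $(vt/\hbar)^2$.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1


def exact_probability(v, delta, t):
    """Occupation of state 1 at time t for H = [[0, v], [v, delta]], starting in state 0."""
    omega = math.sqrt(v * v + 0.25 * delta * delta)
    return (v / omega) ** 2 * math.sin(omega * t / HBAR) ** 2


def short_time_probability(v, t):
    """Leading short-time behavior |<1|H|0> t / hbar|^2."""
    return (v * t / HBAR) ** 2


ratios = []
for t in (1e-3, 2e-3, 4e-3):
    p = exact_probability(0.3, 1.0, t)
    ratios.append(p / short_time_probability(0.3, t))
    print(t, p, ratios[-1])
```

Doubling t quadruples the probability as long as t is small compared with ħ divided by the relevant energy scale, which is the quadratic behavior described above.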

But linear behavior follows immediately from the Liouville equation just derived, 
since for the diagonal elements of the density operator in the energy representation 
(H;|n) = |n) En), it delivers the rate equation (occasionally also called the Pauli 
equation, but which must not be confused with the non-relativistic approximation of 
the Dirac equation mentioned on p. 327): 


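$$\frac{d\langle n|\rho|n\rangle}{dt} = \sum_{n'} \bigl( W_{nn'}\,\langle n'|\rho|n'\rangle - W_{n'n}\,\langle n|\rho|n\rangle \bigr) ,$$

with the transition rate

$$W_{nn'} = \frac{2\pi}{\hbar}\, g_\pm^{k}\, |\langle n|V_\mp^{k}|n'\rangle|^2 , \quad \text{for } E_n \lessgtr E_{n'} ,$$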


where the index k becomes fixed by n and n'. Note that the transition rate is often
referred to as the transition probability, but is not normalized to 1. It gives the
average number of transitions in the time dt. As for operators, we shall write the
initial state after the final state here, even though we are not strictly speaking dealing
with matrix elements in the usual sense. If we swap n' and n, we obtain $W_{n'n} =
2\pi\, g_\mp^{k}\, |\langle n'|V_\pm^{k}|n\rangle|^2/\hbar$ for $E_n \lessgtr E_{n'}$, as is indeed required. In particular, we often also
use the abbreviation (without terms n' = n)

$$\Gamma_n = \hbar \sum_{n'} W_{n'n} ,$$


which is already useful in the above-mentioned rate equation. We shall discuss such
rate equations in Sect. 6.2 and use them to prove in particular the entropy law for
"closed systems", i.e., systems separated from their environment, but which also have
many internal degrees of freedom in addition to a few observable ones, and hence
according to Sect. 4.6.1 fit into the framework considered here. Energy is conserved
in such closed systems, so we may set $g_+^{k} = g_-^{k}$ and obtain $W_{nn'} = W_{n'n}$.
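A small numerical sketch may help to see how such a rate equation behaves. The Python code below is an added illustration, not part of the original derivation: the three-state model, the symmetric rate matrix W (as for a closed system with $W_{nn'} = W_{n'n}$), the time step, and the explicit Euler scheme are all assumptions chosen for simplicity.

```python
def rate_step(p, W, dt):
    """One explicit Euler step of dp_n/dt = sum_n' (W[n][n'] p[n'] - W[n'][n] p[n]).

    W[n][m] is the rate from the initial state m to the final state n
    (initial state written after the final state, as in the text).
    """
    n_states = len(p)
    new_p = []
    for n in range(n_states):
        gain = sum(W[n][m] * p[m] for m in range(n_states) if m != n)
        loss = sum(W[m][n] for m in range(n_states) if m != n) * p[n]
        new_p.append(p[n] + dt * (gain - loss))
    return new_p


# Hypothetical symmetric rates (arbitrary units), W[n][m] = W[m][n]
W = [[0.0, 0.5, 0.2],
     [0.5, 0.0, 0.8],
     [0.2, 0.8, 0.0]]

p = [1.0, 0.0, 0.0]          # initially only state 0 is occupied
for _ in range(20000):       # integrate up to t = 20 with dt = 1e-3
    p = rate_step(p, W, 1e-3)

print(p, sum(p))
```

In this sketch the sum of the occupation probabilities stays equal to 1 (trace conservation), and for symmetric rates the occupations approach equal values.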

Since the change in the atomic system is only relatively slow, this suggests using
the initial values on the right-hand side of the rate equation and then determining
the derivatives with respect to time, without first integrating the coupled system
of equations. If initially we have the pure state $|n_0\rangle$, Fermi's golden rule for the
determination of the transition rates follows for all states $|n\rangle \neq |n_0\rangle$:

$$\frac{d\langle n|\rho|n\rangle}{dt} = W_{nn_0} = \frac{2\pi}{\hbar}\, g_\pm^{k}\, |\langle n|V_\mp^{k}|n_0\rangle|^2 , \quad \text{for } E_n \lessgtr E_{n_0} .$$

Since the rate equation conserves the trace of $\rho$, we now also infer

$$\frac{d\langle n_0|\rho|n_0\rangle}{dt} = -\sum_{n} W_{nn_0} = -\frac{\Gamma_{n_0}}{\hbar} ,$$

initially, i.e., for $t \ll \hbar/\Gamma_{n_0}$.

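For the off-diagonal elements of $\rho$, the so-called coherences, as long as we leave
out the terms $g_0\,(2V\rho V - V^2\rho - \rho V^2)$ with energy-conserving $V = V^\dagger$, from the
general result of the last section, we obtain

$$\frac{d\langle n|\rho|n'\rangle}{dt} = \Bigl( \frac{E_n - E_{n'}}{i\hbar} - \frac{\Gamma_n + \Gamma_{n'}}{2\hbar} \Bigr) \langle n|\rho|n'\rangle , \quad \text{for } n \neq n' .$$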


In particular, for $E_n \lessgtr E_{n'}$, from

$$g_+\,(2V_-\rho V_+ - \{V_+V_-, \rho\}) + g_-\,(2V_+\rho V_- - \{V_-V_+, \rho\}) ,$$

only the part $-g_\pm\,\rho\,V_\pm V_\mp - g_\mp\,V_\mp V_\pm\,\rho$ contributes, because the creation and
annihilation operators each connect only two states to each other, and only the sum over k
comprises all different states. Addition of the term with the factor $g_0$, viz.,

$$\langle n|2V\rho V - V^2\rho - \rho V^2|n'\rangle = -\bigl(\langle n|V|n\rangle - \langle n'|V|n'\rangle\bigr)^2\, \langle n|\rho|n'\rangle ,$$

increases the damping in comparison with the expression we have kept here. In this
way, the differential equations decouple and lead to

$$\langle n|\rho(t)|n'\rangle = \langle n|\rho(0)|n'\rangle\, \exp\Bigl( -\frac{\tfrac{1}{2}(\Gamma_n + \Gamma_{n'}) + i\,(E_n - E_{n'})}{\hbar}\; t \Bigr) ,$$

or even more strongly damped. The coherences thus decrease with time. The density
operator in the energy representation finally becomes diagonal, and occupation
probabilities also become classically understandable. This was discussed for doublets in
the last section, and shown in Fig. 4.19. There we had W} = 2y, and Wy, = 2y_. 
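The decay law for the coherences can be evaluated directly. The following Python sketch is an added illustration; the numerical values of the widths and energies, and the choice of units with ħ = 1, are assumptions made only for the example.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1


def coherence(rho0, gamma_n, gamma_m, e_n, e_m, t):
    """Off-diagonal element <n|rho(t)|n'> from the decoupled equation.

    gamma_n, gamma_m play the role of Gamma_n and Gamma_n'; e_n, e_m of E_n and E_n'.
    """
    rate = 0.5 * (gamma_n + gamma_m) + 1j * (e_n - e_m)
    return rho0 * cmath.exp(-rate * t / HBAR)


value = coherence(0.5, 0.2, 0.4, 1.0, 3.0, 2.0)
print(abs(value), cmath.phase(value))
```

The modulus decays with the mean of the two widths, here exp(-0.3 t), while the energy difference only rotates the phase.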


4.6.5 Rate Equation for Degeneracy. Transitions Between 
Multiplets 


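When we have degeneracy, we have to consider still further states. We shall denote
them with a bar, viz., $|\bar n\rangle$ and $|\bar{\bar n}\rangle$ will be degenerate with $|n\rangle$, and $|\bar n'\rangle$ with $|n'\rangle$. Instead
of the rate equation for the occupation probabilities, we have

$$\frac{d\langle n|\rho|\bar n\rangle}{dt} = \sum_{n'\bar n'} W_{n\bar n\,n'\bar n'}\, \langle n'|\rho|\bar n'\rangle - \sum_{\bar{\bar n}} \frac{\Gamma_{n\bar{\bar n}}\, \langle \bar{\bar n}|\rho|\bar n\rangle + \langle n|\rho|\bar{\bar n}\rangle\, \Gamma_{\bar{\bar n}\bar n}}{2\hbar} ,$$

with

$$W_{n\bar n\,n'\bar n'} = \frac{2\pi}{\hbar}\, g_\pm\, \langle n|V_\mp|n'\rangle \langle \bar n'|V_\pm|\bar n\rangle , \quad \text{for } E_n \lessgtr E_{n'} ,$$

and $\Gamma_{\bar n n} = \hbar \sum_{n'} W_{n'n'\,n\bar n}$. (When there was no degeneracy, we introduced $W_{nn'} =
W_{nn\,n'n'}$ and $\Gamma_n = \Gamma_{nn}$.) In contrast, for the matrix elements of $\rho$ between states of
different energy, it follows that

$$\frac{d\langle n|\rho|n'\rangle}{dt} = \frac{E_n - E_{n'}}{i\hbar}\, \langle n|\rho|n'\rangle - \frac{\sum_{\bar n} \Gamma_{n\bar n}\, \langle \bar n|\rho|n'\rangle + \sum_{\bar n'} \langle n|\rho|\bar n'\rangle\, \Gamma_{\bar n'n'}}{2\hbar} .$$

Here the sum over $\bar n$ also takes the value n, the sum over $\bar n'$ takes the value n', and
above, the sum over $\bar{\bar n}$ also takes the values n and $\bar n$.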

The directional degeneracy of the angular momentum multiplets delivers an
important example. Instead of $|n\rangle$, here it is better to write $|jm\rangle$. In the following, $E_j$ is
the energy of the ground state and $E_{j'}$ is the energy of the excited state. If we restrict
ourselves to the coupling to the vacuum, with g_ = 0 and g} = gm(Eo + SE), then 
we have 


$$W_{jmm',\,j'm''m'''} = \frac{2\pi}{\hbar}\, g_+\, \langle jm|V|j'm''\rangle \langle j'm'''|V|jm'\rangle ,$$

in addition to $W_{j'm''m''',\,jmm'} = 0$ with $g_- = 0$. The vacuum does not prefer any direction,
and hence leads to a special selection between k and k'. The two interactions
couple only to a scalar. We restrict ourselves to radiation of multi-polarity n (usually
dipole radiation, i.e., n = 1, but in nuclear physics, higher multipole radiation also
occurs) and use the Wigner-Eckart theorem:

$$\langle jm|V^{(n)}_\nu|j'm'\rangle = \left( \begin{smallmatrix} j' & n \\ m' & \nu \end{smallmatrix} \middle| \begin{smallmatrix} j \\ m \end{smallmatrix} \right) \frac{\langle j\|V^{(n)}\|j'\rangle}{\sqrt{2j+1}} .$$

This means that the directional dependence of the matrix elements is included via the
Clebsch-Gordan coefficients. Then only one reduced matrix element $\langle j\|V^{(n)}\|j'\rangle$ and
the factor $(2j+1)^{-1/2}$ remains, split-off in such a way that, for a Hermitian operator,
the symmetry $|\langle j\|V^{(n)}\|j'\rangle| = |\langle j'\|V^{(n)}\|j\rangle|$ remains. The above-mentioned isotropy


delivers 
n aot 
m v 


A ma’ ,j'm" m" 
h Wimm jimm” a j EA 2 


and hence, using the orthogonality of the Clebsch-Gordan coefficients,

$$\Gamma_{j'm''m'''} = \hbar \sum_m W_{jmm,\,j'm''m'''} = 2\pi\, g_+\, \frac{|\langle j\|V^{(n)}\|j'\rangle|^2}{2j'+1}\; \delta_{m''m'''} .$$

We note that $m'' = m'''$ has to hold, whence $\Gamma_{j'm''m''}$ here does not depend on the
directional quantum numbers. Hence we set

$$\Gamma_{j'} = 2\pi\, g_+\, \frac{|\langle j\|V^{(n)}\|j'\rangle|^2}{2j'+1} ,$$

and obtain, for the matrix elements of the density operator in the upper multiplet,

$$\frac{d\langle j'm'|\rho|j'm''\rangle}{dt} = -\frac{\Gamma_{j'}}{\hbar}\, \langle j'm'|\rho|j'm''\rangle ,$$


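for those in the lower multiplet,

$$\frac{d\langle jm|\rho|jm'\rangle}{dt} = \sum_{m''m'''} W_{jmm',\,j'm''m'''}\, \langle j'm''|\rho|j'm'''\rangle ,$$

and for the matrix elements between the two multiplets,

$$\frac{d\langle jm|\rho|j'm'\rangle}{dt} = \Bigl( \frac{E_j - E_{j'}}{i\hbar} - \frac{\Gamma_{j'}}{2\hbar} \Bigr) \langle jm|\rho|j'm'\rangle .$$

Here the properties of the Clebsch-Gordan coefficients lead to

$$W_{jmm',\,j'm''m'''} = \left( \begin{smallmatrix} j & n \\ m & m''-m \end{smallmatrix} \middle| \begin{smallmatrix} j' \\ m'' \end{smallmatrix} \right) \left( \begin{smallmatrix} j & n \\ m' & m'''-m' \end{smallmatrix} \middle| \begin{smallmatrix} j' \\ m''' \end{smallmatrix} \right) \frac{\Gamma_{j'}}{\hbar}\; \delta_{m-m',\,m''-m'''} ,$$

since all other terms vanish. Consequently, all sub-states of the excited level decay
with the same time constant, and the amplitudes of the coherences $\langle jm|\rho|j'm'\rangle$
decrease exponentially with time, but only with half the decay constant. If all
sub-states of the excited level were initially occupied with the same probability and
those of the ground state were unoccupied, so that initially $\langle j'm'|\rho|j'm''\rangle =
\delta_{m'm''}/(2j'+1)$, it then follows that

$$\frac{d\langle jm|\rho|jm'\rangle}{dt} = \frac{\Gamma_{j'}}{\hbar}\, \frac{\delta_{mm'}}{2j+1} ,$$

if we make use of the properties of the Clebsch-Gordan coefficients.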


4.6.6 Damped Linear Harmonic Oscillations 


An important example is provided by the oscillator coupled to its environment. It is
without degeneracy, but has only one creation and one annihilation operator between
its states, as long as we neglect multi-quantum processes for the damping (like $\Psi^2$,
but also $\Psi^\dagger\Psi$ for $V_0$). Hence the index k is superfluous, and we set $V_+ = v\,\Psi^\dagger$ and
$V_- = v\,\Psi$ with $[\Psi, \Psi^\dagger] = 1$. The result of Sect. 4.6.3 then takes the form

$$\frac{d\rho}{dt} = \frac{[H, \rho]}{i\hbar} + \frac{\pi v^2}{\hbar} \bigl\{ g_+\, [\Psi\rho, \Psi^\dagger] + g_-\, [\Psi^\dagger\rho, \Psi] + \text{h.c.} \bigr\} .$$

Note that expressions like $[\Psi^\dagger\Psi\rho, \Psi^\dagger\Psi]$ + h.c. lead to pure phase damping, which
we shall not pursue here, and multi-quantum processes are still possible. Using the
abbreviations


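$$\gamma = \frac{\pi v^2\, (g_+ - g_-)}{\hbar} \quad \text{and} \quad \langle\Psi^\dagger\Psi\rangle_\infty = \frac{g_-}{g_+ - g_-} ,$$

we obtain the differential equations

$$\frac{d\langle\Psi^\dagger\Psi\rangle}{dt} + 2\gamma\, \langle\Psi^\dagger\Psi\rangle = 2\gamma\, \langle\Psi^\dagger\Psi\rangle_\infty \quad \text{and} \quad \frac{d\langle\Psi^l\rangle}{dt} + l\,(i\omega + \gamma)\, \langle\Psi^l\rangle = 0 ,$$

which can be integrated easily:

$$\langle\Psi^\dagger\Psi\rangle_t = \langle\Psi^\dagger\Psi\rangle_\infty + \{\langle\Psi^\dagger\Psi\rangle_0 - \langle\Psi^\dagger\Psi\rangle_\infty\}\, \exp(-2\gamma t) ,$$
$$\langle\Psi^l\rangle_t = \langle\Psi^l\rangle_0\, \exp(-il\omega t)\, \exp(-l\gamma t) .$$

This result is similar to what we found for the two-level system (see Sect. 4.6.3).
However, now $\langle\Psi^\dagger\Psi\rangle_\infty = g_-/(g_+ - g_-)$. With $g_+/g_- \approx (\bar n + 1)/\bar n$, the average
excitation energy approaches the value $\hbar\omega\,\bar n$, hence the average excitation energy
of the environment. For thermal radiation, we have $\bar n = \{\exp(\hbar\omega/k_{\rm B}T) - 1\}^{-1}$, and
for the vacuum it is equal to zero.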
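The relaxation of the mean occupation can be checked numerically. The Python sketch below is an added illustration: it assumes exactly $g_+/g_- = (\bar n + 1)/\bar n$ with a thermal $\bar n$, and the chosen ratio $\hbar\omega/k_{\rm B}T = 0.5$, the scale of $g_-$, and the function names are arbitrary, hypothetical choices.

```python
import math


def thermal_occupation(x):
    """Mean thermal occupation n_bar = 1/(exp(x) - 1), with x = hbar*omega/(k_B*T)."""
    return 1.0 / math.expm1(x)


def stationary_occupation(g_plus, g_minus):
    """Final value <Psi^dagger Psi>_infinity = g_-/(g_+ - g_-)."""
    return g_minus / (g_plus - g_minus)


def occupation(t, n0, n_inf, gamma):
    """<Psi^dagger Psi>_t relaxing from n0 to n_inf at the rate 2*gamma."""
    return n_inf + (n0 - n_inf) * math.exp(-2.0 * gamma * t)


n_bar = thermal_occupation(0.5)
g_minus = 1.0                              # arbitrary scale (assumption)
g_plus = g_minus * (n_bar + 1.0) / n_bar   # assumed ratio g_+/g_- = (n_bar + 1)/n_bar

n_inf = stationary_occupation(g_plus, g_minus)
print(n_bar, n_inf, occupation(50.0, 3.0, n_inf, 0.1))
```

With this ratio of $g_+$ and $g_-$, the stationary occupation coincides with the thermal value $\bar n$, independently of the initial occupation.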

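Since X and P are linear combinations of $\Psi$ and $\Psi^\dagger$, for the damped harmonic
oscillation, $\langle X\rangle$ and $\langle P\rangle$ decrease at the rate $\gamma$ independently of the initial state, while
for stationary states the final state is already reached at the outset (see p. 388):

$$\frac{\langle X\rangle_t}{x_0} + i\, \frac{\langle P\rangle_t}{p_0} = \langle\Psi\rangle_t = \langle\Psi\rangle_0\, \exp(-i\omega t)\, \exp(-\gamma t) .$$

Classically (in Sect. 2.3.7), for $\gamma \ll \omega$, i.e., weak coupling to the environment, we
have the same result, as Ehrenfest's theorem confirms (Sect. 4.4.1). But classically,
we do not have the uncertainties:

$$\Bigl(\frac{\Delta X}{x_0}\Bigr)^2 + \Bigl(\frac{\Delta P}{p_0}\Bigr)^2 = \tfrac{1}{2} + \langle\Psi^\dagger\Psi\rangle_\infty + \{\langle\Psi^\dagger\Psi\rangle_0 - \langle\Psi^\dagger\rangle_0 \langle\Psi\rangle_0 - \langle\Psi^\dagger\Psi\rangle_\infty\}\, \exp(-2\gamma t) ,$$

$$\Bigl(\frac{\Delta X}{x_0}\Bigr)^2 - \Bigl(\frac{\Delta P}{p_0}\Bigr)^2 = \mathrm{Re}\bigl\{ (\langle\Psi^2\rangle_0 - \langle\Psi\rangle_0^{\,2})\, \exp(-2i\omega t) \bigr\}\, \exp(-2\gamma t) .$$

In the course of time, $\Delta X/x_0$ and $\Delta P/p_0$ take the same value, which is determined
solely by the environmental temperature and respects the limit $\Delta X \cdot \Delta P \geq \tfrac{1}{4} x_0 p_0 =
\tfrac{1}{2}\hbar$ set by Heisenberg's uncertainty relation.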
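The sum and difference of the squared uncertainties determine $\Delta X/x_0$ and $\Delta P/p_0$ separately. The following Python sketch is an added illustration under assumptions not taken from the text: the initial state is taken to be a squeezed vacuum with $\langle\Psi\rangle_0 = 0$, $\langle\Psi^\dagger\Psi\rangle_0 = \sinh^2 r$ and $\langle\Psi^2\rangle_0 = -\sinh r \cosh r$ (one possible phase convention), the environment is the vacuum, and the parameter values are arbitrary.

```python
import cmath
import math


def uncertainties(t, n0, psi0, psi2_0, n_inf, gamma, omega):
    """Return (Delta X / x0, Delta P / p0) at time t.

    n0 = <Psi^dagger Psi>_0, psi0 = <Psi>_0, psi2_0 = <Psi^2>_0, n_inf = final occupation.
    """
    damping = math.exp(-2.0 * gamma * t)
    total = 0.5 + n_inf + (n0 - abs(psi0) ** 2 - n_inf) * damping
    diff = ((psi2_0 - psi0 ** 2) * cmath.exp(-2j * omega * t)).real * damping
    return math.sqrt(0.5 * (total + diff)), math.sqrt(0.5 * (total - diff))


r = 0.5                                   # hypothetical squeezing parameter
n0 = math.sinh(r) ** 2
psi2_0 = -math.sinh(r) * math.cosh(r)     # assumed phase convention

dx0, dp0 = uncertainties(0.0, n0, 0.0, psi2_0, 0.0, 0.05, 1.0)
dx_inf, dp_inf = uncertainties(200.0, n0, 0.0, psi2_0, 0.0, 0.05, 1.0)
print(dx0, dp0, dx0 * dp0)
print(dx_inf, dp_inf)
```

Initially the two relative uncertainties differ while their product equals 1/4, i.e., $\Delta X \cdot \Delta P = \tfrac{1}{4} x_0 p_0$. For an environment at zero temperature, both approach 1/2 at late times.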

In addition, the initial values of $\langle X\rangle$, $\langle P\rangle$, $\Delta X$, and $\Delta P$ clearly do not yet fix
the uncertainties, since $\langle\Psi^2\rangle_0 - \langle\Psi\rangle_0^{\,2}$ is a complex number and therefore requires
further input (namely its rate of change, or the direction of the ellipse in Fig. 4.20).

The example shown in Fig. 4.20 comes from an initial "quench state". This will
be discussed in Sect. 5.5.4. These are pure states and have $\Delta X/x_0 \neq \Delta P/p_0$ (hence
the name), but the smallest possible uncertainty product $\Delta X \cdot \Delta P$, i.e., $\tfrac{1}{4} x_0 p_0 = \tfrac{1}{2}\hbar$.
There are of course states for which the product of these uncertainties is greater. For
$\langle\Psi^l\rangle$, the above-mentioned phase damping leads to a factor $\exp(-l^2 \gamma_0 t)$ which also
affects the uncertainties $\Delta X$ and $\Delta P$, but not the energy.

Fig. 4.20 Phase space representation of damped linear oscillations according to quantum theory,
with equal damping as in the classical case (see Fig. 2.21). In addition to the values (indicated already
there) for $\langle X\rangle/x_0$ and $\langle P\rangle/p_0$, the uncertainties $\Delta X/x_0$ and $\Delta P/p_0$ can still be read off here. They
remain finite, but always become more similar with time. The circle in the middle shows the final
state. Of course, for the uncertainties, other initial conditions could be valid than those drawn here

Fig. 4.21 Time-dependence of the excitation energy $E^*$ (dashed green curve) and its
uncertainty for the same damped oscillations as in Fig. 4.20, here relative to the
initial energy $E^*_0$. Continuous blue curves show $(E^* \pm \Delta E^*)/E^*_0$ for
the initial state there, and dotted red curves the same for initially sharp energies

Figure 4.21 shows the time dependence of the excitation energy $\langle E^*\rangle = \langle\Psi^\dagger\Psi\rangle\, \hbar\omega$
and its uncertainty. With $(\Delta X/x_0)^2 + (\Delta P/p_0)^2 = \langle\Psi^\dagger\Psi + \tfrac{1}{2}\rangle - \langle\Psi^\dagger\rangle \langle\Psi\rangle$, the
energy is already fixed with the initial values introduced so far, and its uncertainty
only by the further initial value $\langle(\Psi^\dagger\Psi)^2\rangle_0$, which for quench states can be
determined using the normal-ordered characteristic function introduced in Sect. 5.5.6 (see
p. 481):


(HTW) = F(T), + CT o — 3 (YTW)o} exp(—4y0) . 


Thus it can also be zero initially, for $\langle(\Psi^\dagger\Psi)^2\rangle_0 = (\langle\Psi^\dagger\Psi\rangle_0)^2$, but this dependence
on the initial uncertainty is rather quickly damped, as Fig. 4.21 shows.


4.6.7 Summary: Dissipation and Quantum Theory 


The coupling of an object to unobservable degrees of freedom induces dissipation. 
The energy does not remain conserved. Classically, this is assigned to friction, which 
is inaccessible to Hamiltonian mechanics. In quantum theory we also require exten- 
sions which go beyond the von Neumann equation (and the Schrödinger equation). 
Dirac’s perturbation theory helps quite a bit here, but further approximations (in 
particular the Markov approximation) are necessary, until the expressions can be 
evaluated. 

These lead to Fermi's golden rule among other things. The derivative of the
occupation probability of an energy state with respect to time, thus the transition rate
from the initial to the final state, is equal to the square of the absolute value of its
coupling to the initial state times the state density of the relevant reservoir (for finite
temperatures, there is one further factor), except for a factor of $2\pi/\hbar$.

But we have also found out how the coherences (the non-diagonal elements of the 
density operators) depend on time. Their damping ensures decoherence: quantum- 
physical phase effects vanish in favor of classically understandable occupation prob- 
abilities. Decoherence leads to a collapse of the wave function. It is often overlooked 
that we always deal with a statistical ensemble, and by selecting a special state, we 
prepare the old state anew. Decoherence thus leads from quantum physics to classi- 
cal physics, which is essential for each measurable process, since only then can we 
arrive at classically realizable situations. 

As important as these results are, there remain essential example applications for 
further chapters (Quantum Mechanics II). We have not yet dealt with many-body 
problems (where in particular the fact that the particles are indistinguishable has 
noteworthy consequences), nor with scattering problems and relativistic effects. 


Problems 


Problem 4.1 Which probability amplitude $\psi(x)$ fits a Gauss distribution $|\psi(x)|^2$
with $\bar x = 0$ and $\Delta x \neq 0$? What does its Fourier transform

$$\psi(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \exp(-ikx)\, \psi(x)\, dx$$

look like? Show that the factor $1/\sqrt{2\pi}$ here ensures $\int_{-\infty}^{\infty} |\psi(k)|^2\, dk = 1$. Determine
$\Delta x \cdot \Delta k$ for this example. (6 P)


Problem 4.2 Given a slit of width 2a, assume that the probability amplitude
$\psi(x) = 1/\sqrt{2a}$ for $|x| \leq a$, otherwise zero. How large is $\Delta x$? Determine the Fourier
transform. Where are the maximum and the neighboring minima of $|\psi(k)|^2$, and how
large are they? Show that the "interference pattern" $|\psi(k)|^2$ becomes more extended
with decreasing slit width, but that the product $\Delta x \cdot \Delta k$ is problematic. (6 P)


Problem 4.3 Consider the Lorentz distribution

$$|\psi(\omega)|^2 \propto 1/\{(\omega - \omega_0)^2 + (\tfrac{1}{2}\gamma)^2\} .$$

How large is the uncertainty $\Delta\omega$, and how large is its half-width, i.e., the distance
at which $|\psi(\omega)|^2$ has decayed to half the maximum value? Show that $\psi(\omega)$ is the
Fourier transform of $\psi(t) \propto \exp\{-i(\omega_0 - \tfrac{1}{2} i\gamma)\, t\}$ for t > 0, zero for t < 0. Can we
describe decays with it? How large is the time uncertainty $\Delta t$? (8 P)


Problem 4.4 The transition from the initial state $|i\rangle$ to the final state $|f\rangle$ should be
possible via any of the states $|a\rangle$, $|b\rangle$, and $|c\rangle$. How large is the transition
probability $|\langle f|i\rangle|^2$ if the states $|a\rangle$ and $|b\rangle$ may interfere, but $|c\rangle$ has to be superposed
incoherently? (2 P)


Problem 4.5 Prove $\int_{-\infty}^{\infty} f(x)\, \delta^{(n)}(x - x')\, dx = (-)^n f^{(n)}(x')$ for square-integrable
functions using integration by parts. Deduce from this that the equation
$x\, \delta'(x) = -\delta(x)$ is true for the integrand. Prove $\delta(ax) = \frac{1}{|a|}\, \delta(x)$. (6 P)


Problem 4.6 A series of functions $\{g_n(x)\}$ forms a complete orthonormal set in
the interval from a to b, if $\int_a^b g_n^*(x)\, g_{n'}(x)\, dx = \delta_{nn'}$ and $f(x) = \sum_n g_n(x)\, f_n$ for all
(square-integrable) functions f(x). How can the expansion coefficients $f_n$ be
determined? Expand the delta-function $\delta(x - x')$ (with $x' \in [a, b]$) with respect to this
basis. Does the sequence $g_n(x) = (2a)^{-1/2} \exp(i\pi n x/a)$ form a complete orthonormal
system in the interval $-a \leq x \leq a$? (6 P)


Problem 4.7 The system of Legendre polynomials $P_n(x)$ is complete in the interval
$|x| \leq 1$. The generating function is $1/\sqrt{1 - 2sx + s^2} = \sum_{n=0}^{\infty} P_n(x)\, s^n$ for |s| < 1.
How does the associated orthonormal system read? Show that the Legendre
polynomials may also be represented by

$$P_n(x) = \frac{1}{2^n\, n!}\, \frac{d^n (x^2 - 1)^n}{dx^n} \quad \text{(Rodrigues' formula).}$$

(6 P)


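Problem 4.8 The normalized state $|\psi\rangle = |\alpha\rangle a + |\beta\rangle b$ is constructed from the
orthonormalized states $|\alpha\rangle$ and $|\beta\rangle$. What constraint do the coefficients $a \neq 0$ and
$b \neq 0$ satisfy? How do they depend on $|\psi\rangle$? Determine which of the following
normalized states $|\psi_i\rangle$ are physically equivalent to $|\psi\rangle$ (disregarding the phase
factor): $|\psi_1\rangle = -|\alpha\rangle a - |\beta\rangle b$, $|\psi_2\rangle = |\alpha\rangle a - |\beta\rangle b$, $|\psi_3\rangle = |\alpha\rangle a\, e^{i\varphi} + |\beta\rangle b\, e^{-i\varphi}$,
$|\psi_4\rangle = |\alpha\rangle \cos\varphi + |\beta\rangle \sin\varphi$. (6 P)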


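Problem 4.9 Does the sequence of Hilbert space vectors

$$\begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \end{pmatrix} , \quad \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \end{pmatrix} , \quad \begin{pmatrix} 0 \\ 0 \\ 1 \\ \vdots \end{pmatrix} , \quad \ldots$$

converge strongly, weakly, or not at all? If so, give the vector to which the sequence
converges. (4 P)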


Problem 4.10 Consider the function $\psi(x) = x$ for $-\pi \leq x \leq \pi$. How does it read
as a Hilbert vector in the sequence space if we take the basis $\{g_n(x)\}$ of Problem 4.6
(with $a = \pi$)? How does the Hilbert vector in the function space read if it has the
components $\psi_n = \delta_{n,1} + \delta_{n,-1}$ in this basis of the sequence space? (4 P)


Problem 4.11 Are the functions $f_0(x) \propto 1$ and $f_1(x) \propto x$ orthogonal to each other
for $-\pi \leq x \leq \pi$? Determine their normalization factors. Extend the orthonormalized
basis $\{f_0, f_1\}$ so that it is complete for all second-order functions $f(x) = a_0 + a_1 x +
a_2 x^2$ in $-\pi \leq x \leq \pi$. (6 P)

Problem 4.12 Determine $[A, [B, C]_+] + [B, [C, A]_+] + [C, [A, B]_+]$ and
simplify the expression $[C, [A, B]_+]_+ - [B, [C, A]_+]_+$. Is

$$(A\,[B, C]_+ - [C, A]_+\, B)\, D + C\, (A\,[B, D]_+ - [D, A]_+\, B)$$

a simple commutator? (6 P)


Problem 4.13 Let the unit operator 1 be decomposed into a projection operator P 
and its complement Q, viz., 1 = P + Q. Is Q also idempotent? Are P and Q orthogonal 
to each other, i.e., is tr(PQ) = 0 true? What are the eigenvalues of P and Q? 

(4 P) 


Problem 4.14 Is the inverse of a unitary operator also unitary? Is the product of two
unitary operators unitary? Is $(1 - iA)(1 + iA)^{-1}$ unitary if A is Hermitian? Justify
all answers! (4 P)


Problem 4.15 Suppose $(A - a_1 1)(A - a_2 1) = 0$ and let $|\psi\rangle$ be arbitrary, but not an
eigenvector of A. Show that $(A - a_1)|\psi\rangle$ and $(A - a_2)|\psi\rangle$ are eigenvectors of A, and
determine the eigenvalues. Determine the eigenvalues of the 2 x 2 matrix A with
elements $A_{ik}$. If the matrix is Hermitian, show that no degeneracy can occur if the
matrix is not diagonal. (6 P)




Problem 4.16 Do orthogonal operators remain orthogonal under a unitary transfor- 
mation? (2 P) 


Problem 4.17 Why is the determinant of the matrix elements of the operator A equal 
to the product of its eigenvalues? (4 P) 


Problem 4.18 Let the vectors a and b commute with the Pauli operator $\boldsymbol\sigma$. How
can $(\mathbf a \cdot \boldsymbol\sigma)(\mathbf b \cdot \boldsymbol\sigma)$ then be expressed in the basis $\{1, \boldsymbol\sigma\}$? What follows for $(\mathbf a \cdot \boldsymbol\sigma)^2$
and what for the anti-commutator $\{\mathbf a \cdot \boldsymbol\sigma, \mathbf b \cdot \boldsymbol\sigma\}$? Expand the unitary operator $U =
\exp(i\mathbf a \cdot \boldsymbol\sigma)$ in terms of the basis $\{1, \boldsymbol\sigma\}$. (6 P)


Problem 4.19 The boson annihilation operator $\Psi$ is in fact not Hermitian and
therefore does not necessarily have real eigenvalues, but any complex number $\psi$
may be an eigenvalue of $\Psi$. Determine (up to the normalization factor) the
associated eigenvector in the particle-number basis, and hence the coefficients $\langle n|\psi\rangle$ in
$|\psi\rangle = \sum_n |n\rangle \langle n|\psi\rangle$. Why is this not possible for the creation operator $\Psi^\dagger$? For
arbitrary complex numbers $\alpha$ and $\beta$, consider the scalar product $\langle\alpha|\beta\rangle$ and determine
the unknown normalization factor. (8 P)


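Problem 4.20 Show using the method of induction that

$$\left\{ \begin{matrix} \Psi^m\, \Psi^{\dagger n} \\ \Psi^{\dagger n}\, \Psi^m \end{matrix} \right\} = \sum_l \frac{(\pm)^l\, m!\, n!}{l!\, (m-l)!\, (n-l)!} \left\{ \begin{matrix} \Psi^{\dagger\, n-l}\, \Psi^{m-l} \\ \Psi^{m-l}\, \Psi^{\dagger\, n-l} \end{matrix} \right\} .$$

(7 P)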


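Problem 4.21 Which 2 x 2 matrices correspond to the Pegg-Barnett operators
$\Psi$, $\Psi^\dagger$, and $\Psi\Psi^\dagger \pm \Psi^\dagger\Psi$, if the basis has only two eigenvalues (s = 1)? Do these
operators behave like field operators for fermions? (4 P)


Problem 4.22 From $\sigma_x\sigma_y = i\sigma_z = -\sigma_y\sigma_x$ and $\sigma_x^{\,2} = 1$ (and cyclic permutations),
and also $\sigma_\pm = \tfrac{1}{2}(\sigma_x \pm i\sigma_y)$, determine $\sigma_z\sigma_\pm$, $\sigma_\pm\sigma_z$, $\sigma_\pm\sigma_\mp$ and $\sigma_\pm^{\,2}$. What do we
obtain therefore for $U \sigma_\pm U^\dagger$ with $U = \exp(i\alpha\sigma_z)$, according to the Hausdorff series?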
Simplify the Hermitian operators 0,00,, 0400+, and O00404 + 04040. (9 P) 


Problem 4.23 As is well known, the position and momentum coordinates of a par- 
ticle span its phase space. Show that a classical linear oscillation with angular fre- 
quency w traces an ellipse in phase space, and determine its area as a function of 
the energy. How large is the probability density for finding the oscillator at the dis- 
placement x for oscillations with amplitude 7, if all phase angles are initially equally 
probable? (Here we thus consider a statistical ensemble.) (6 P) 


Problem 4.24 Since $\Delta X \cdot \Delta P \geq \tfrac{1}{2}\hbar$, the phase-space cells may not be chosen
arbitrarily small (more finely divided cells would be meaningless). How large is the area
if the energy increases by $\hbar\omega$ from cell to cell? Is it possible to associate particles at
rest with the cell of lowest energy, which would start oscillating only after gaining
energy? What is the mean value of the energy in this cell? (4 P)


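Problem 4.25 Show that the matrix $\langle\psi_1|P|\psi_2\rangle = \int_{-\infty}^{\infty} \psi_1^*(x)\, \frac{\hbar}{i} \frac{d}{dx}\, \psi_2(x)\, dx$ is
Hermitian. What can be concluded from this for the expectation values $\langle P\rangle$ and
$\langle P^2\rangle$ for a real wave function? (6 P)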


Problem 4.26 Derive the 2 x 2 density matrix of the spin states of unpolarized 
electrons. Why is it not possible to represent it by a Hilbert vector? (4 P) 


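Problem 4.27 Why does the quantum-mechanical expression $\tfrac{1}{2}\{f(X)\, P + P\, f(X)\}$
correspond to the classical $f(x)\, p$ according to the Weyl correspondence?


Hint: Use $i\hbar f'(X) = [f(X), P]$. (6 P)


Problem 4.28 Justify the validity of the following quantum-mechanical expressions,
independent of the representation, with a homogeneous magnetic field B
and Coulomb gauge: $\mathbf A = \tfrac{1}{2}\, \mathbf B \times \mathbf R$, $\mathbf P \cdot \mathbf A + \mathbf A \cdot \mathbf P = \mathbf B \cdot \mathbf L$, and $\mathbf P \times \mathbf A + \mathbf A \times \mathbf P =
-i\hbar\, \mathbf B$. (4 P)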


Problem 4.29 In approximate calculations for motions with high orbital angular
momentum, we often replace $\langle L^2\rangle/\hbar^2$ by the square of a number (as if it were the
expectation value of $L/\hbar$). Which number is better than l? How large is the relative
error for l = 3 and l = 5? (4 P)


Problem 4.30 Is it possible to express the Poisson bracket $[\mathbf l \cdot \mathbf e_1, \mathbf a \cdot \mathbf e_2]$ in terms of
the triple product $\mathbf a \cdot (\mathbf e_1 \times \mathbf e_2)$ if a is the position or momentum vector? (4 P)


Problem 4.31 Derive the uncertainties $\Delta L_x$ and $\Delta L_y$ for the state $|l, m\rangle$. Hence,
determine also $(\Delta L_x)^2 + (\Delta L_y)^2 + (\Delta L_z)^2$. (2 P)


Problem 4.32 Does L commute with $R^2$ and $P^2$? (2 P)

Problem 4.33 For classical vectors r and p, the equations

$$(\mathbf r \times \mathbf p)^2 = r^2 p^2 - (\mathbf r \cdot \mathbf p)^2 , \qquad \mathbf p \times (\mathbf r \times \mathbf p) = \mathbf r\, p^2 - \mathbf p\; \mathbf r \cdot \mathbf p ,$$

are valid. How do they read for the associated operators? (4 P)

Problem 4.34 Derive all spherical harmonics for l = 0, 1, and 2. (4 P)


Problem 4.35 Determine the integrals over all directions Q of yO *(Q), YOQ), 
and Y® (2). 


Hint: Express the integrals initially with scalar products ((2|/m). (2 P) 


Problem 4.36 For spherically symmetric problems, the ansatz

$$\psi_{nlm}(\mathbf r) = r^{-1}\, u_{nl}(r)\; i^l\, Y^{(l)}_m(\Omega)$$

turns out to be useful. Using this, reduce $\langle nlm|\, r\cos\theta\, |n'00\rangle$ to a simple integral,
given that the integral over the directions is known.


Hint: $r\cos\theta$ corresponds to $\mathbf R \cdot \mathbf e_z$ in the position representation. (4 P)


Problem 4.37 What do we obtain for $\langle nlm|\, (r\cos\theta)^2\, |n'00\rangle$ and $\langle nlm|\, \mathbf P \cdot \mathbf e_z\, |n'00\rangle$
with the ansatz just mentioned? (4 P)


Problem 4.38 The scalar product of two angular momentum operators $\mathbf J_1$ and $\mathbf J_2$
may be expressed in terms of $J_{1z}$, $J_{1\pm}$ and $J_{2z}$, $J_{2\pm}$, viz.,

$$\mathbf J_1 \cdot \mathbf J_2 = \tfrac{1}{2}\, (J_{1+} J_{2-} + J_{1-} J_{2+}) + J_{1z} J_{2z} .$$

This helps for the uncoupled basis, but for the coupled basis, the total angular
momentum $\mathbf J = \mathbf J_1 + \mathbf J_2$ should be used. Determine the matrix elements of the operator
$\boldsymbol\sigma_1 \cdot \boldsymbol\sigma_2$ in the uncoupled basis $\{|\tfrac{1}{2} m_1, \tfrac{1}{2} m_2\rangle\}$ and in the coupled one $\{|(\tfrac{1}{2}\tfrac{1}{2})\, s m\rangle\}$.
How can we express the projection operators $P_S$ on the singlet and triplet states (with
S = 0 and S = 1, respectively) using $\boldsymbol\sigma_1 \cdot \boldsymbol\sigma_2$? (6 P)


Problem 4.39 Represent all $d_{3/2}$ states $|(2\,\tfrac{1}{2})\, \tfrac{3}{2} m\rangle$ in the uncoupled basis. (4 P)


Problem 4.40 How many p states are there for a spin-$\tfrac{1}{2}$ particle? Expand in terms
of the basis of the total angular momentum. (4 P)


Problem 4.41 Which Ehrenfest equations are valid for the orbital angular momen- 
tum? In particular, is the angular momentum a constant on average for a central 
force? (6 P) 


Problem 4.42 Let $\psi(\mathbf r) \approx f(\theta)\, r^{-1} \exp(ikr)$ hold for large r. How large is the
associated current density for large r? (2 P)


Problem 4.43 How does the position uncertainty for the Gauss wave packet

$$\psi(k) = \frac{\exp\{-\tfrac{1}{4}\, (k - \bar k)^2/(\Delta k)^2\}}{\sqrt{\sqrt{2\pi}\; \Delta k}}$$

depend upon time? In the final result, use $\Delta x(0)$ and $\Delta v$ instead of $\Delta k$. Determine
$\bar x(t)$ for the case $\bar x(0) = 0$. (6 P)


Problem 4.44 Write down the Schrödinger equation for the two-body hydrogen 
atom problem in center-of-mass and relative coordinates. Which (normalized) solu- 
tion do we have in center-of-mass coordinates? (4 P) 


Problem 4.45 For the generalized Laguerre polynomials $L^{(m)}_n(s)$ and for |t| < 1,
there is a generating function $(1 - t)^{-m-1} \exp\{-st/(1 - t)\} = \sum_{n=0}^{\infty} L^{(m)}_n(s)\, t^n$. For
Se Se L™ (8) L (8) ds, use this to derive the expansion 


; k—m)\(k—-m 
=i" kK+D!/l!. 
(-) me) Ger) et OY 


It is needed for the expectation value (R*) of the hydrogen atom, viz., (R“) = 
Jy lul? r* dr, with 


-1-1)! sit! 
Hal = Soar r Yo 


and s = 2r/(naọ). How large is (R) as a function of n, L, and ag? (8 P) 


List of Symbols 


We stick closely to the recommendations of the International Union of Pure and 
Applied Physics (IUPAP) and the Deutsches Institut für Normung (DIN). These 
are listed in Symbole, Einheiten und Nomenklatur in der Physik (Physik-Verlag, 
Weinheim 1980) and are marked here with an asterisk. However, one and the same 
symbol may represent different quantities in different branches of physics. Therefore, 
we have to divide the list of symbols into different parts (Table 4.3). 


Table 4.3 Symbols used in Quantum Mechanics I

   Symbol                                   Name                                          Page number
*  $|\psi\rangle$                           Ket-vector (state vector)                     282
*  $\langle\psi|$                           Bra-vector                                    283
*  $\langle\varphi|\psi\rangle$             Scalar product, probability amplitude         282
*  $\langle\mathbf r|\psi\rangle = \psi(\mathbf r)$   Wave function (position representation)       286, 320
*  $\langle\mathbf p|\psi\rangle = \psi(\mathbf p)$   Wave function (momentum representation)       286
   $\langle n|A|n'\rangle = A_{nn'}$        Matrix element of the operator A              290
*  $\langle A\rangle = \bar A$              Expectation value of the operator A           298
*  $[A, B] = [A, B]_-$                      Commutator of A and B                         289
   $\{A, B\} = [A, B]_+$                    Anti-commutator of A and B                    289
*  $A^\dagger$                              Hermitian adjoint of operator A               292
*  $A^{-1}$                                 Inverse of operator A                         292
   $U$                                      Unitary operator ($U^\dagger = U^{-1}$)       293
   $\Psi$                                   Annihilation operator                         302
   $\Psi^\dagger$                           Creation operator                             302
   $\mathbf R$                              Position operator                             318
   $\mathbf P$                              Momentum operator                             318
   $\mathbf L$                              Orbital angular momentum operator             328
   $\mathbf S$                              Spin (angular momentum) operator              335
*  $\boldsymbol\sigma$                      Pauli operator                                308
   $H$                                      Hamilton operator                             339
   $P$                                      Parity operator                               313
   $T$                                      Time-reversal operator                        313
   $T$                                      Time-ordering operator                        346
   $\rho$                                   Density operator                              323
   $\rho(\mathbf r, \mathbf p)$             Wigner function                               322
   $Y^{(l)}_m(\Omega)$                      Spherical harmonic                            332
   $\left(\begin{smallmatrix} j_1 & j_2 \\ m_1 & m_2 \end{smallmatrix}\middle|\begin{smallmatrix} j \\ m \end{smallmatrix}\right)$   Clebsch-Gordan coefficient   337
*  $\alpha$                                 Fine structure constant                       623
*  $a_0$                                    Bohr radius                                   362
   $\mu_{\rm B}$                            Bohr magneton                                 327
References

1. W. Heisenberg, The Physical Principles of the Quantum Theory (Dover, 1930)
2. J. von Neumann, Mathematische Grundlagen der Quantentheorie (Springer, Berlin, 1968), p. 4
3. P. Güttinger, Z. Phys. 73, 169 (1931)
4. D.T. Pegg, S.M. Barnett, Europhys. Lett. 6, 483 (1988); Phys. Rev. A 39, 1665 (1989)
5. E.U. Condon, G.H. Shortley, The Theory of Atomic Spectra (Cambridge University Press, 1935)
6. O.L. deLange, R.E. Raab, Phys. Rev. A 34, 1650 (1986)
7. M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions (Dover, New York, 1964)
8. E. Stiefel, A. Fässler, Group Theoretical Methods and Their Applications (Birkhäuser-Springer, Heidelberg, 1992)


Suggestions for Textbooks and Further Reading

9. C. Cohen-Tannoudji, B. Diu, F. Laloë, Quantum Mechanics 1-2 (Wiley, New York, 1977)
10. R. Dick, Advanced Quantum Mechanics: Materials and Photons (Springer, New York, 2012)
11. P.A.M. Dirac, The Principles of Quantum Mechanics (Clarendon, Oxford)
12. A.S. Green, Quantum Mechanics in Algebraic Representation (Springer, Berlin)
13. W. Greiner, Quantum Mechanics: An Introduction (Springer, New York, 2001)
14. G. Ludwig, Foundations of Quantum Mechanics (Springer, New York, 1985)
15. L.D. Landau, E.M. Lifshitz, Course of Theoretical Physics Vol. 3: Quantum Mechanics, Non-Relativistic Theory, 3rd edn. (Pergamon, Oxford, London, 1977)
16. A. Messiah, Quantum Mechanics I-II (North-Holland, Amsterdam, 1961-1962)
17. C. Itzykson, J. Zuber, Quantum Field Theory (McGraw-Hill, New York, 1980)
18. D. Jackson, Mathematics for Quantum Mechanics (Benjamin, New York, 1962)
19. J.M. Jauch, F. Rohrlich, The Theory of Photons and Electrons. The Relativistic Quantum Field Theory of Charged Particles with Spin One-half (Springer, Berlin, 1976)
20. W. Nolting, Theoretical Physics 6: Quantum Mechanics, Basics (Springer, Berlin, 2017)
21. W. Nolting, Theoretical Physics 7: Quantum Mechanics, Methods and Approximations (Springer, Berlin, 2017)
22. P. Roman, Advanced Quantum Theory (Addison-Wesley, Reading)
23. J.J. Sakurai, Advanced Quantum Mechanics (Addison-Wesley, Reading MA, 1967)
24. J.J. Sakurai, J. Napolitano, Modern Quantum Mechanics, 2nd edn. (Addison-Wesley, Boston, 2011)
25. F. Scheck, Quantum Physics, 2nd edn. (Springer, Berlin, 2013)
26. F. Schwabl, Quantum Mechanics (Springer, Berlin)


Chapter 5
Quantum Mechanics II


5.1 Scattering Theory 


5.1.1 Introduction 


In simple descriptions of the scattering process, where a sharp energy is assumed and
the time factor $\exp(-i\omega t)$ subsequently left out, the obvious result of this chapter can
be stated immediately: if a plane wave $\exp(i\mathbf k \cdot \mathbf r)$ falls on a scattering center, then the
original wave and the outgoing spherical wave $f(\theta) \exp(ikr)/r$ become superposed,
and then the scattering amplitude $f(\theta)$ is of decisive importance. Here the
center-of-mass system is assumed, and the reduced mass $m_0$ and kinetic energy $E = \hbar\omega =
(\hbar k)^2/2m_0$ are given. As will be shown in the following, for large distances r from
the scattering center, we have (see Fig. 5.1)

$$\langle\mathbf r|\mathbf k\rangle^+ \approx \frac{1}{\sqrt{2\pi}^{\,3}} \Bigl( \exp(i\mathbf k \cdot \mathbf r) + f(\theta)\, \frac{\exp(ikr)}{r} \Bigr) .$$

Here the scattering amplitude $f(\theta)$ is connected to the scattering operator S and the
transition operator T, these being the important quantities in scattering theory. From
the scattering amplitude, we can obtain, e.g., the differential scattering cross-section
for the scattering angle $\theta$ (as derived on p. 418)

$$\frac{d\sigma}{d\Omega} = |f(\theta)|^2 .$$

With these expressions we can already solve the simplest scattering problems.
To this end, we decompose the plane wave $\exp(i\mathbf k \cdot \mathbf r)$ in terms of spherical waves:

$$\exp(i\mathbf k \cdot \mathbf r) = \frac{4\pi}{kr} \sum_{lm} F_l(kr)\; Y^{(l)*}_m(\Omega_k)\; i^l\, Y^{(l)}_m(\Omega_r) .$$


© Springer Nature Switzerland AG 2018
A. Lindner and D. Strauch, A Complete Course
on Theoretical Physics, Undergraduate Lecture Notes in Physics,
https://doi.org/10.1007/978-3-030-04360-5_5


Fig. 5.1 Scattering with scattering angle $\theta$ (angle of deflection) and collision parameter s (see Fig.
2.6). If s is too large, there is no scattering force


In order to prove this equation, with $\rho = kr$, we start from $\exp(i\rho\cos\theta)$. We expand
this in terms of Legendre polynomials. According to p. 82 (or Problem 4.7), they
form an orthogonal system, normalized to $(l + \tfrac{1}{2})^{-1}$ in the variable $\cos\theta$:

$$\exp(i\rho\cos\theta) = \sum_l \frac{2l+1}{\rho}\, F_l(\rho)\; i^l\, P_l(\cos\theta) ,$$

with the regular spherical Bessel function (see Fig. 5.2 top)

$$F_l(\rho) = \frac{\rho}{2\, i^l} \int_{-1}^{1} d\cos\theta\; P_l(\cos\theta)\, \exp(i\rho\cos\theta) .$$

Note that this name usually refers to $j_l(\rho) = \rho^{-1} F_l(\rho) = \sqrt{\pi/(2\rho)}\, J_{l+1/2}(\rho)$, but for
the expansion in terms of spherical harmonics in Sect. 4.5.2, we always wanted to take
out a factor 1/r from the radial function, and $F_l(\rho)$ actually has more comfortable
properties. In particular, $F_0(\rho) = \sin\rho$ and $F_1(\rho) = \rho^{-1}\sin\rho - \cos\rho$ (since $P_0 = 1$
and $P_1 = \cos\theta$), and the higher Bessel functions result from the recursion relation

$$F_{l+1}(\rho) = \frac{2l+1}{\rho}\, F_l(\rho) - F_{l-1}(\rho) ,$$

which can itself be derived from the recursion relations for Legendre
polynomials (see p. 82). For the rest of the proof, we still have to expand the Legendre
polynomials in terms of spherical harmonics:

$$P_l(\cos\theta) = \frac{4\pi}{2l+1} \sum_{m=-l}^{l} Y^{(l)*}_m(\Omega_k)\; Y^{(l)}_m(\Omega_r) .$$

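For the proof of this addition theorem for spherical harmonics, we rotate by a rotation vector $\boldsymbol{\phi}$. We thus have $Y^{(l)}_m(\Omega') = \sum_{m'} Y^{(l)}_{m'}(\Omega)\, \mathcal{D}^{(l)}_{m'm}(\boldsymbol{\phi})$, and the rotation operator $\mathcal{D}$ is unitary: $\sum_m \mathcal{D}^{(l)}_{m'm}(\boldsymbol{\phi})\, \mathcal{D}^{(l)*}_{m''m}(\boldsymbol{\phi}) = \delta_{m'm''}$. If we now choose one of the two directions $\Omega_k$ or $\Omega_r$ as the new $z$-direction and use Sect. 4.3.9, and in particular, the equations $Y^{(l)}_m(0, 0) = \sqrt{(2l+1)/4\pi}\;\delta_{m0}$ and $Y^{(l)}_0(\Omega) = \sqrt{(2l+1)/4\pi}\; P_l(\cos\theta)$, then the addition theorem is proven.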
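The recursion relation for $F_l$ and the Legendre expansion of the plane wave can be checked numerically. The following is a minimal sketch in plain Python; the function names, the truncation `lmax=14`, and the sample values of $\rho$ and $\cos\theta$ are illustrative choices, not part of the text. Upward recursion is used for simplicity, which is only reliable as long as $l$ does not greatly exceed $\rho$.

```python
import cmath
import math

def regular_F(lmax, rho):
    """Return [F_0(rho), ..., F_lmax(rho)] from the upward recursion
    F_{l+1} = (2l+1)/rho * F_l - F_{l-1}, starting from F_0 and F_1."""
    F = [math.sin(rho), math.sin(rho) / rho - math.cos(rho)]
    for l in range(1, lmax):
        F.append((2 * l + 1) / rho * F[l] - F[l - 1])
    return F[: lmax + 1]

def legendre_P(lmax, x):
    """Return [P_0(x), ..., P_lmax(x)] from the standard three-term recursion."""
    P = [1.0, x]
    for l in range(1, lmax):
        P.append(((2 * l + 1) * x * P[l] - l * P[l - 1]) / (l + 1))
    return P[: lmax + 1]

def plane_wave_series(rho, cos_theta, lmax):
    """Truncated sum over l of (2l+1)/rho * F_l(rho) * i^l * P_l(cos theta)."""
    F = regular_F(lmax, rho)
    P = legendre_P(lmax, cos_theta)
    return sum((2 * l + 1) / rho * F[l] * (1j ** l) * P[l] for l in range(lmax + 1))

rho, cos_theta = 5.0, 0.3            # illustrative sample point
exact = cmath.exp(1j * rho * cos_theta)
approx = plane_wave_series(rho, cos_theta, lmax=14)
print("truncation error:", abs(exact - approx))
```

For fixed $\rho$ the terms with $l \gg \rho$ fall off like $\rho^{l+1}/(2l+1)!!$, which is why a modest `lmax` already reproduces the plane wave closely.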


Fig. 5.2 Spherical Bessel functions with l from 0 (black) to 3 (blue) (continuous for l even, dotted for l odd). Top: regular F_l. Bottom: irregular G_l. In addition to these spherical functions, there are also the normal (cylindrical) Bessel functions (see Fig. 5.17)


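For the regular spherical Bessel functions $F_l(\rho)$, we have asymptotically

$$F_l(\rho) \approx \begin{cases} \rho^{\,l+1}/(2l+1)!! & \text{for } \rho \approx 0 , \\ \sin(\rho - \tfrac12 l\pi) & \text{for } \rho \gg l(l+1) , \end{cases}$$

where the double factorial $(2l+1)!!$ is the product of all odd integers up to $2l+1$, viz., $(2l+1)!! = \prod_{n=0}^{l} (2n+1) = (2l+1)!/(2^l\, l!)$. Then, with $\langle\mathbf{k}'|\mathbf{k}\rangle = (2\pi)^{-3} \int \mathrm{d}^3r\, \exp\{\mathrm{i}(\mathbf{k} - \mathbf{k}')\cdot\mathbf{r}\}$, we have $\int_0^\infty \mathrm{d}r\, F_l(kr)\, F_l(k'r) = \tfrac12\pi\, \delta(k - k')$. In addition, it solves the differential equation

$$\Big( \frac{\mathrm{d}^2}{\mathrm{d}\rho^2} - \frac{l(l+1)}{\rho^2} + 1 \Big) F_l(\rho) = 0 ,$$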


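as do the remaining spherical Bessel functions, i.e., the irregular Bessel functions (Neumann functions) (see Fig. 5.2 bottom)

$$G_l(\rho) \approx \begin{cases} (2l-1)!!\; \rho^{-l} & \text{for } \rho \approx 0 \text{ and } l > 0 \ \ (\cos\rho \text{ for } l = 0) , \\ \cos(\rho - \tfrac12 l\pi) & \text{for } \rho \gg l(l+1) , \end{cases}$$

the outgoing Bessel function (Hankel function)

$$O_l(\rho) = G_l(\rho) + \mathrm{i} F_l(\rho) \approx \exp\{\mathrm{i}(\rho - \tfrac12 l\pi)\} \quad \text{for } \rho \gg l(l+1) ,$$

and the incoming Bessel function (Hankel function)

$$I_l(\rho) = G_l(\rho) - \mathrm{i} F_l(\rho) = O_l^{*}(\rho) .$$

These functions are solutions of the radial Schrödinger equation

$$\Big( \frac{\partial^2}{\partial r^2} - \frac{l(l+1)}{r^2} + \frac{2m_0}{\hbar^2}\,\{E - V(r)\} \Big)\, u_l(k, r) = 0$$

for large $r$, because there $V(r)$ will be negligibly small compared to $E > 0$:

$$u_l(k, r) \approx N_l\, \{ F_l(kr) - \pi\, T_l\, O_l(kr) \} .$$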


Here, we shall actually superpose a plane wave and an outgoing spherical wave. 
Starting from the boundary condition u;(k, 0) = 0, which is necessary according to 
p. 353, so that the wave function is differentiable at the origin, and with a convenient 
slope at the origin which just fixes an inessential factor, we can integrate the differen- 
tial equation up to the point where the above-mentioned splitting in terms of Bessel 
functions occurs. Since this is also possible for the first derivative with respect to r, 
noting that the normalization factor N; cancels, the unknown transition amplitude is 
given by 


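$$T_l = \frac{1}{\pi}\, \frac{W(u_l, F_l)}{W(u_l, O_l)} ,$$

with the Wronski determinant

$$W(u_l, F_l) = u_l\, \frac{\partial F_l}{\partial r} - \frac{\partial u_l}{\partial r}\, F_l \quad\text{and}\quad W(u_l, O_l) = u_l\, \frac{\partial O_l}{\partial r} - \frac{\partial u_l}{\partial r}\, O_l .$$

With the normalization factor $N_l = \sqrt{2/\pi}/k$ of $u_l$ in

$$\langle\mathbf{r}|\mathbf{k}\rangle^{+} = \sum_{lm} \frac{u_l(k, r)}{r}\; Y^{(l)*}_m(\Omega_k)\; \mathrm{i}^l\; Y^{(l)}_m(\Omega_r) ,$$

the asymptotic expression for $\langle\mathbf{r}|\mathbf{k}\rangle^{+}$ with

$$O_l(kr) \approx \mathrm{i}^{-l} \exp(\mathrm{i}kr)$$

yields the scattering amplitude

$$f(\theta) = -\frac{\pi}{k} \sum_l (2l+1)\, T_l\, P_l(\cos\theta) ,$$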


and we can derive the scattering cross-section from this. Note that, for low energies, only a few terms contribute to this series. With increasing $l$, the centrifugal potential increasingly dominates the remaining $V(r)$, whence $u_l \to N_l F_l$, and along with it $T_l \to 0$.

Having made this introduction with its prescriptions for proper calculations, we shall now proceed to investigate the scattering process in more detail.
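The prescription above (integrate the radial equation outward from $u_l(k, 0) = 0$, then read off $T_l$ from the two Wronski determinants) can be sketched for the simplest case $l = 0$, where $F_0 = \sin kr$ and $O_0 = \exp(\mathrm{i}kr)$. The attractive square well, the units $2m_0/\hbar^2 = 1$, the Runge-Kutta integrator, and all names below are assumptions made for this illustration. The comparison value uses the textbook s-wave phase shift $\delta_0$ of the square well together with $T_0 = -\pi^{-1} \mathrm{e}^{\mathrm{i}\delta_0} \sin\delta_0$, which follows from $u_0 \propto \sin(kr + \delta_0)$ outside the well.

```python
import cmath
import math

def integrate_u(k, potential, r_end, n_steps=4000):
    """RK4 integration of u'' = (V(r) - k^2) u for l = 0.
    Assumed units: 2*m0/hbar^2 = 1, so V is measured in units of k^2."""
    h = r_end / n_steps
    u, du = 0.0, 1.0  # u(0) = 0; the slope only fixes an inessential factor

    def ddu(r, u_val):
        return (potential(r) - k * k) * u_val

    for i in range(n_steps):
        r = i * h
        a1, b1 = du, ddu(r, u)
        a2, b2 = du + 0.5 * h * b1, ddu(r + 0.5 * h, u + 0.5 * h * a1)
        a3, b3 = du + 0.5 * h * b2, ddu(r + 0.5 * h, u + 0.5 * h * a2)
        a4, b4 = du + h * b3, ddu(r + h, u + h * a3)
        u += h * (a1 + 2 * a2 + 2 * a3 + a4) / 6.0
        du += h * (b1 + 2 * b2 + 2 * b3 + b4) / 6.0
    return u, du

def t0_from_wronskians(k, r, u, du):
    """T_0 = W(u, F_0) / (pi * W(u, O_0)) with F_0 = sin(kr), O_0 = exp(i k r)."""
    F, dF = math.sin(k * r), k * math.cos(k * r)
    O = cmath.exp(1j * k * r)
    dO = 1j * k * O
    w_uF = u * dF - du * F
    w_uO = u * dO - du * O
    return w_uF / (math.pi * w_uO)

V0, a, k = 4.0, 1.0, 1.0  # hypothetical well depth, range, and wave number

def square_well(r):
    # tiny tolerance so that the grid point r = a still counts as inside
    return -V0 if r <= a * (1.0 + 1e-9) else 0.0

u_a, du_a = integrate_u(k, square_well, a)
T0_numeric = t0_from_wronskians(k, a, u_a, du_a)

K = math.sqrt(k * k + V0)                               # wave number inside the well
delta0 = math.atan(k / K * math.tan(K * a)) - k * a     # s-wave phase shift
T0_exact = -cmath.exp(1j * delta0) * math.sin(delta0) / math.pi
print(T0_numeric, T0_exact)
```

Because the normalization of $u$ cancels in the ratio of the Wronski determinants, the arbitrary initial slope has no effect on the result.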




5.1.2 Basics 


In order to clarify the basic notions of scattering theory, we restrict ourselves initially 
to elastic two-body scattering and investigate only the change in the motion due to 
the forces between the two scattering partners. Since the interaction V depends only 
on the relative distance (and possibly also on the spin) of the scattering partners, it 
is thus translation invariant, and we can disregard the center-of-mass motion. The
center of mass moves unperturbed, with fixed momentum. Therefore, we consider
only the relative motion and use the reduced mass $m_0$, keeping $m$ for the directional
(magnetic) quantum number.

As already for classical collisions (Sect. 2.2.3), we assume that the partners before 
and after the scattering move unperturbed. The coupling $V$ is assumed to have a
finite range, i.e., it should decrease more rapidly than $r^{-1}$. The Coulomb force is an
exception, which we consider separately in Sect. 5.2.3. The ray is usually directed
toward an uncharged probe, and then there is no Coulomb field, but it is nevertheless 
important in nuclear physics, because the screening action of the atomic shell may be 
neglected there, and only the interaction between the nuclei counts. But in the present 
discussion, we shall assume that the scattering partners act on each other only for a 
comparably short while—before and after, they are outside the range of the forces 
and move unperturbed. Each scattering is a time-dependent situation. Therefore the 
unperturbed motion must not be described by a plane wave, since this would be 
equally probable in the whole space, and there would be no “before” and no “after”. 
Instead we have to take wave packets. This we shall do rather superficially, in the 
sense that we shall not provide the exact form of the wave packet. We shall then be 
able to work out basic notions of time-dependent scattering theory. The next step 
will be to go over to time-independent scattering theory (with sharp energy) using a 
Fourier transform, whereupon the calculations become rather simple. 

The Schrédinger equation is normally taken as the most important starting equa- 
tion in any introduction to quantum theory. This is suitable for bound states, because 
their wave functions are essentially already determined by this differential equation. 
The boundary conditions are still missing, of course, but these are self-evident for 
bound states with the required normalizability and lead to the well-known eigenvalue 
problem. For unbound states (scattering states), in contrast, the boundary conditions
still play an important role in determining the solution, and for many applications only
the asymptotic behavior is significant. Therefore, we shall work with an integral
equation which contains the Hamilton operator as well as the boundary conditions,
namely the Lippmann-Schwinger equation.


5.1.3 Time Shift Operators in Perturbation Theory 


In the Schrödinger picture the development of a state with time $t$ can be given by the unitary time shift operator $U(t, t_0)$:

$$|\psi(t)\rangle = U(t, t_0)\, |\psi(t_0)\rangle , \quad\text{with}\quad U(t_0, t_0) = 1 ,$$

and thus $U(t, t_0) = U^{-1}(t_0, t) = U^{\dagger}(t_0, t)$. Here, according to the Schrödinger equation,

$$\mathrm{i}\hbar\, \frac{\partial}{\partial t}\, U(t, t_0) = H\, U(t, t_0) \quad\Longrightarrow\quad U(t, t_0) = \exp\frac{-\mathrm{i}H\,(t - t_0)}{\hbar} ,$$

provided that the Hamilton operator $H$ does not depend on time, which we assume. The time shift operator by itself is not enough for the description of the scattering problem. Initial conditions have to be added. But these refer to states in which there are no forces acting between the scattering partners, so not all of the Hamilton operator $H$ is important, only the free (unperturbed) Hamilton operator $H_0$:

$$H = H_0 + V .$$


We indicate, e.g., the initial state by the relative momentum $\mathbf{p}$ with a suitable distribution function for a wave packet. It remains unaltered only until the interaction $V$ between the scattering partners becomes notable:

$$[H, \mathbf{P}] \neq 0 , \quad\text{but}\quad [H_0, \mathbf{P}] = 0 .$$

The above-mentioned Hamilton operators do not depend on time, only their effects on the states do.

In addition to the full Hamilton operator $H$ and time shift operator $U(t, t_0)$, it is therefore appropriate to consider also the free operator $H_0$ or again $U_0(t, t_0)$, and to employ the Dirac picture. According to p. 346, we have $U(t, t_0) = U_0(t, t_0)\, U_{\mathrm{D}}(t, t_0)$ and

$$U_{\mathrm{D}}(t, t_0) = 1 + \int_{t_0}^{t} \mathrm{d}t'\; \frac{V_{\mathrm{D}}(t', t_0)\, U_{\mathrm{D}}(t', t_0)}{\mathrm{i}\hbar} ,$$

with $V_{\mathrm{D}}(t', t_0) = U_0^{\dagger}(t', t_0)\, V\, U_0(t', t_0)$. Here $U_0(t, t_0)$ can be decomposed into $U_0(t, t')\, U_0(t', t_0)$, and $U_0$ is unitary, with $U_0(t, t_0)\, U_0^{\dagger}(t', t_0) = U_0(t, t')$. From this follows the important equation


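$$U(t, t_0) = U_0(t, t_0) + \int_{t_0}^{t} \mathrm{d}t'\; \frac{U_0(t, t')\, V\, U(t', t_0)}{\mathrm{i}\hbar} ,$$

which can be derived from

$$\mathrm{i}\hbar\, \frac{\partial}{\partial t'}\, \{U_0(t, t')\, U(t', t_0)\} = U_0(t, t')\, \{-H_0 + H\}\, U(t', t_0)$$

by integrating over $t'$ from $t_0$ to $t$. With

$$\mathrm{i}\hbar\, \frac{\partial}{\partial t'}\, \{U(t, t')\, U_0(t', t_0)\} = U(t, t')\, \{-H + H_0\}\, U_0(t', t_0) ,$$

we clearly have the equally important result

$$U(t, t_0) = U_0(t, t_0) + \int_{t_0}^{t} \mathrm{d}t'\; \frac{U(t, t')\, V\, U_0(t', t_0)}{\mathrm{i}\hbar} .$$
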


These two "important" equations form the basis for all that follows. Since $|\psi(t_0)\rangle$ has to be given by the initial conditions, everything worth knowing about the scattering power of the interaction is contained in $U(t, t_0)$. Note that $U_0$ is known here, but the question remains as to how $V$ affects $U$.

For stepwise integration, the two forms deliver the same Neumann series

$$U(t, t_0) = U_0(t, t_0) + \int_{t_0}^{t} \mathrm{d}t'\; \frac{U_0(t, t')\, V\, U_0(t', t_0)}{\mathrm{i}\hbar} + \int_{t_0}^{t} \mathrm{d}t' \int_{t_0}^{t'} \mathrm{d}t''\; \frac{U_0(t, t')\, V\, U_0(t', t'')\, V\, U_0(t'', t_0)}{(\mathrm{i}\hbar)^2} + \cdots .$$


It represents the time shift operator $U(t, t_0)$ of the full problem as a sum of time
shift operators which feel the potential only at the times $t'$, $t''$, etc., between $t_0$ and
$t$ and are otherwise determined by $H_0$, i.e., they are "free" (unperturbed). With the
$n$th term, $n$ interactions occur. If $V$ changes the motion only a little, then this series
converges fast. In the Born approximation, we terminate after the first term (with one
$V$). This is often a good approximation, but certainly not for resonances.


5.1.4 Time-Dependent Green Functions (Propagators) 


We search for the time shift operators for long time spans, because we want to connect
the initial and final states. We shall not be concerned with intermediate states that
cannot be measured. Therefore, we now set $t_0 = 0$ and investigate the behavior for
$t \to \pm\infty$. For these convergence investigations it is better to consider the distant
past ($t \to -\infty$) and the far future ($t \to +\infty$) separately.

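Using the step function $\varepsilon(x)$ from p. 18 (see Fig. 5.3), whose derivative is the delta function, the following quantities are introduced:

$$G^{\pm}(t) = \frac{\varepsilon(\pm t)\, U(t, 0)}{\pm\mathrm{i}\hbar} \quad\text{and}\quad G_0^{\pm}(t) = \frac{\varepsilon(\pm t)\, U_0(t, 0)}{\pm\mathrm{i}\hbar} .$$

Fig. 5.3 The discontinuity functions ε(t) (left) and ε(−t) (right). Since we have ε(−t) = 1 − ε(t), −ε(−t) has the same derivative as ε(t), namely δ(t)

They satisfy the differential equations

$$\Big( \mathrm{i}\hbar\, \frac{\partial}{\partial t} - H \Big)\, G^{\pm}(t) = \delta(t) , \quad\text{or}\quad \Big( \mathrm{i}\hbar\, \frac{\partial}{\partial t} - H_0 \Big)\, G_0^{\pm}(t) = \delta(t) ,$$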


and are therefore called Green functions, since Green functions always solve linear differential equations which have a delta function as the inhomogeneous term. We have encountered other examples of Green functions on pp. 27, 112, and 119. In fact, we are actually dealing with operators, often also called propagators. Clearly, the functions carrying a "+" are nonzero only for $t > 0$ and those carrying a "−" only for $t < 0$. Hence we speak of the retarded (+) and advanced (−) Green functions (propagators). We have

$$\text{for } t \gtrless 0 , \quad U(t, 0) = \pm\mathrm{i}\hbar\, G^{\pm}(t) , \quad U_0(t, 0) = \pm\mathrm{i}\hbar\, G_0^{\pm}(t) ,$$

and use the integral equations of the last sections to derive similar ones for the Green functions:

$$G^{\pm}(t) = G_0^{\pm}(t) + \int_{-\infty}^{\infty} \mathrm{d}t'\; G_0^{\pm}(t - t')\, V\, G^{\pm}(t') = G_0^{\pm}(t) + \int_{-\infty}^{\infty} \mathrm{d}t'\; G^{\pm}(t - t')\, V\, G_0^{\pm}(t') .$$


For $G^{+}$, the integrand vanishes outside $0 < t' < t$, and for $G^{-}$, outside $t < t' < 0$.
With the integration limits extended to $\pm\infty$, we may combine the equations for the
retarded and advanced Green functions and obtain integral equations of the Volterra type.
Here we find convolution integrals. According to p. 22, we can transform them into
products using a Fourier transform and then evaluate the unknown $G^{\pm}$ from $G_0^{\pm}$ and
$V$ algebraically.


5.1.5 Energy-Dependent Green Functions (Propagators) and 
Resolvents 


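Fourier transforming the integral equations of the time-dependent propagators

$$G^{\pm}(E) = \int_{-\infty}^{\infty} \mathrm{d}t\; \exp\frac{\mathrm{i}Et}{\hbar}\; G^{\pm}(t) ,$$

and keeping the factor $\sqrt{2\pi}$, we obtain the Lippmann-Schwinger equations

$$G^{\pm}(E) = G_0^{\pm}(E) + G_0^{\pm}(E)\, V\, G^{\pm}(E) = G_0^{\pm}(E) + G^{\pm}(E)\, V\, G_0^{\pm}(E) ,$$

since, with $\tau = t - t'$,

$$G^{\pm}(E) = G_0^{\pm}(E) + \int_{-\infty}^{\infty} \mathrm{d}t' \int_{-\infty}^{\infty} \mathrm{d}t\; \exp\frac{\mathrm{i}Et}{\hbar}\; G_0^{\pm}(t - t')\, V\, G^{\pm}(t') = G_0^{\pm}(E) + \int_{-\infty}^{\infty} \mathrm{d}\tau\; \exp\frac{\mathrm{i}E\tau}{\hbar}\; G_0^{\pm}(\tau)\, V\, G^{\pm}(E) .$$

These equations can be solved formally:

$$G^{\pm}(E) = \frac{1}{1 - G_0^{\pm}(E)\, V}\; G_0^{\pm}(E) = G_0^{\pm}(E)\; \frac{1}{1 - V\, G_0^{\pm}(E)} .$$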


We often write the right-hand side as a Neumann series, viz.,

$$G^{\pm}(E) = G_0^{\pm}(E) + G_0^{\pm}(E)\, V\, G_0^{\pm}(E) + \cdots ,$$

and here possibly neglect the higher order terms (Born approximation).
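To see how the Neumann series approaches the formal solution, one can replace the operators by small matrices. The sketch below uses a hypothetical two-level $H_0$ and coupling $V$, and a complex energy $z = E + \mathrm{i}\eta$ with finite $\eta$ standing in for $E + \mathrm{i}o$; all numbers and helper names are assumptions for illustration only.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][n] * B[n][j] for n in range(2)) for j in range(2)] for i in range(2)]

def matadd(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def inverse(A):
    """Inverse of a 2x2 matrix via its determinant."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

z = 1.5 + 0.2j                       # E + i*eta, finite eta instead of "+io"
H0 = [[1.0, 0.0], [0.0, 2.0]]        # hypothetical free Hamiltonian (diagonal)
V = [[0.0, 0.1], [0.1, 0.05]]        # hypothetical weak coupling

G0 = [[1 / (z - H0[0][0]), 0.0], [0.0, 1 / (z - H0[1][1])]]
G = inverse([[z - H0[0][0] - V[0][0], -V[0][1]],
             [-V[1][0], z - H0[1][1] - V[1][1]]])

# Partial sums G0 + G0 V G0 + G0 V G0 V G0 + ...
term = G0
partial = G0
errors = [max_diff(partial, G)]
for n in range(1, 31):
    term = matmul(matmul(G0, V), term)   # (G0 V)^n G0
    partial = matadd(partial, term)
    errors.append(max_diff(partial, G))

ls_residual = max_diff(G, matadd(G0, matmul(matmul(G0, V), G)))
print("error without V:", errors[0])
print("error with one V (Born):", errors[1])
print("error after 30 terms:", errors[30])
print("Lippmann-Schwinger residual:", ls_residual)
```

The series converges here because $G_0 V$ is small in norm; close to a resonance, where $G_0 V$ becomes large, the truncated series is no longer trustworthy.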

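However, before evaluating $G^{\pm}(E)$, we must first determine the simpler propagator $G_0^{\pm}(E)$ of the free motion and here determine the Fourier integral. With $G_0^{\pm}(t) = (\pm\mathrm{i}\hbar)^{-1}\, \varepsilon(\pm t)\, U_0(t, 0)$ and $U_0(t, 0) = \exp(-\mathrm{i}H_0 t/\hbar)$, we also have

$$G_0^{\pm}(E) = \frac{1}{\mathrm{i}\hbar} \int_0^{\pm\infty} \mathrm{d}t\; \exp\frac{\mathrm{i}(E - H_0)\, t}{\hbar} = \frac{1}{\pm\mathrm{i}\hbar} \int_0^{\infty} \mathrm{d}t\; \exp\frac{\pm\mathrm{i}(E - H_0)\, t}{\hbar} ,$$

where we may use an eigenvalue $E_0$ of $H_0$ in the energy representation. We have already investigated these integrals on p. 22 in the context of distributions, and found there

$$\int_0^{\infty} \mathrm{d}k\; \exp(\pm\mathrm{i}kx) = \frac{\pm\mathrm{i}}{x \pm \mathrm{i}o} = \pm\mathrm{i}\, \Big( \frac{P}{x} \mp \mathrm{i}\pi\, \delta(x) \Big) ,$$

where $(x + \mathrm{i}o)^{-1}$ indicates the limiting value $\epsilon \to +0$ of $(x + \mathrm{i}\epsilon)^{-1}$ and $P$ (Cauchy's) principal value:

$$\int_{-\infty}^{\infty} \mathrm{d}x\; \frac{P}{x}\, f(x) = \lim_{\epsilon \to +0} \Big( \int_{-\infty}^{-\epsilon} + \int_{+\epsilon}^{\infty} \Big)\, \mathrm{d}x\; \frac{f(x)}{x} .$$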

This cuts out a piece around the singular point, with boundaries that converge symmetrically towards this position. The cut-out region is accounted for by the delta function $\delta(x)$ (as in Fig. 1.6):

$$G_0^{\pm}(E) = \frac{P}{E - H_0} \mp \mathrm{i}\pi\, \delta(E - H_0) .$$
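The decomposition $(x + \mathrm{i}o)^{-1} = P/x - \mathrm{i}\pi\,\delta(x)$ can be made plausible numerically by integrating a smooth test function against $(x + \mathrm{i}\epsilon)^{-1}$ for a small but finite $\epsilon$. In the sketch below, the shifted Gaussian test function, the value $\epsilon = 0.01$, the integration ranges, and the quadrature rules are all illustrative assumptions; the agreement is only expected up to corrections of order $\epsilon$.

```python
import math

def f(x):
    """Smooth test function (a Gaussian centred at x = 1, chosen for illustration)."""
    return math.exp(-(x - 1.0) ** 2)

def integral_with_shifted_pole(eps, x_min=-8.0, x_max=10.0):
    """Trapezoidal rule for the integral of f(x)/(x + i*eps).
    The step eps/10 resolves the width of the near-singularity."""
    h = eps / 10.0
    n = int(round((x_max - x_min) / h))
    total = 0.0 + 0.0j
    for i in range(n + 1):
        x = x_min + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * f(x) / (x + 1j * eps)
    return total * h

def principal_value(x_max=10.0, n=20000):
    """Principal-value integral of f(x)/x, rewritten as the regular integral
    of (f(x) - f(-x))/x over x > 0 and evaluated with the midpoint rule."""
    h = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += (f(x) - f(-x)) / x
    return total * h

eps = 0.01
I_eps = integral_with_shifted_pole(eps)
pv = principal_value()
print("Re:", I_eps.real, "vs principal value", pv)
print("Im:", I_eps.imag, "vs -pi*f(0) =", -math.pi * f(0.0))
```

The real part approaches the principal value and the imaginary part approaches $-\pi f(0)$ as $\epsilon$ is reduced, which is the content of the formula for $G_0^{+}(E)$ above.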


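In the following, however, we shall often use

$$G_0^{\pm}(E) = \frac{1}{E \pm \mathrm{i}o - H_0}$$

and correspondingly for $G^{\pm}$, or even just $G_0 = G_0(\mathcal{E}) = (\mathcal{E} - H_0)^{-1}$, although this is only unique for $\mathrm{Im}\,\mathcal{E} \neq 0$. The Lippmann-Schwinger equations follow simply from the operator identity

$$\frac{1}{A} = \frac{1}{B} + \frac{1}{B}\, (B - A)\, \frac{1}{A} = \frac{1}{B} + \frac{1}{A}\, (B - A)\, \frac{1}{B} ,$$

if we set $A = E \pm \mathrm{i}o - H$ and $B = E \pm \mathrm{i}o - H_0$, then as a consequence we have $B - A = V$, and we replace the limiting value of the product by the product of the limiting values. In addition, we clearly have

$$G^{\pm\dagger} = G^{\mp} , \quad G_0^{\pm\dagger} = G_0^{\mp} .$$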


Retarded and advanced propagators are thus adjoints of one another. 

At first glance, it may seem astonishing that we have found an expression for 
Go (E) which makes sense only as a weight function in an integrand. But we describe 
a time-dependent situation (in particular for each scattering process, we distinguish 
between before and after) and the Fourier transform t —> E obscures this situation. 
This procedure is only comprehensible if we calculate with unsharp energies (using 
wave packets, i.e., integral expressions). 


5.1.6 Representations of the Resolvents and the Interactions 


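The resolvent $G_0^{\pm}(E) = (E \pm \mathrm{i}o - H_0)^{-1}$ is diagonal in the energy representation $\{|E\Omega\rangle\}$ and also in the momentum representation $\{|\mathbf{k}\rangle\}$ (with $E = \hbar^2 k^2/2m_0$), and it is interesting to use both representations for scattering problems:

$$\langle E'\Omega'|\, G_0^{\pm}(E)\, |E''\Omega''\rangle = \frac{\langle E'\Omega'|E''\Omega''\rangle}{E \pm \mathrm{i}o - E'} ,$$

$$\langle\mathbf{k}'|\, G_0^{\pm}(E)\, |\mathbf{k}''\rangle = \frac{\langle\mathbf{k}'|\mathbf{k}''\rangle}{E \pm \mathrm{i}o - \hbar^2 k'^2/2m_0} = \frac{2m_0}{\hbar^2}\, \frac{\langle\mathbf{k}'|\mathbf{k}''\rangle}{k^2 \pm \mathrm{i}o - k'^2} .$$

However, the coupling $V$ is usually given as a function of $\mathbf{r}$. Therefore, we now search for the resolvent in the real-space representation. Using the fact that $\langle\mathbf{r}|\mathbf{k}\rangle = (2\pi)^{-3/2} \exp(\mathrm{i}\mathbf{k}\cdot\mathbf{r})$, we find

$$\langle\mathbf{r}|\, G_0^{\pm}(E)\, |\mathbf{r}'\rangle = \frac{1}{(2\pi)^3}\, \frac{2m_0}{\hbar^2} \int \mathrm{d}^3k'\; \frac{\exp\{\mathrm{i}\mathbf{k}'\cdot(\mathbf{r} - \mathbf{r}')\}}{k^2 \pm \mathrm{i}o - k'^2} .$$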


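The integration over the directions of $\mathbf{k}'$ is easy. In particular, if we express the plane wave in terms of spherical harmonics, then introducing $Y^{(l)}_m(\Omega) = \mathrm{i}^{-l} \langle\Omega|lm\rangle$ and $Y^{(0)}_0(\Omega) = 1/\sqrt{4\pi}$, the contribution for the integration over all directions comes only from $l = 0$, since $\int \mathrm{d}\Omega\, \langle lm|\Omega\rangle \langle\Omega|00\rangle = \langle lm|00\rangle$:

$$\int \mathrm{d}\Omega_k\; \exp(\mathrm{i}\mathbf{k}\cdot\mathbf{a}) = 4\pi\, \frac{F_0(ka)}{ka} = 4\pi\, \frac{\sin ka}{ka} .$$

Hence the triple integral is reduced to a single one:

$$\int \mathrm{d}^3k'\; \frac{\exp(\mathrm{i}\mathbf{k}'\cdot\mathbf{a})}{k^2 \pm \mathrm{i}o - k'^2} = \frac{4\pi}{2\mathrm{i}a} \int_0^{\infty} k'\, \mathrm{d}k'\; \frac{\exp(\mathrm{i}k'a) - \exp(-\mathrm{i}k'a)}{k^2 \pm \mathrm{i}o - k'^2} = \frac{2\pi}{\mathrm{i}a} \int_{-\infty}^{\infty} k'\, \mathrm{d}k'\; \frac{\exp(\mathrm{i}k'a)}{k^2 \pm \mathrm{i}o - k'^2} .$$

These integrals can be evaluated using complex analysis. The integrands each have two simple poles in the complex $k'$-plane, with $k'_{+} = \sqrt{k^2 \pm \mathrm{i}o}$ and $k'_{-} = -\sqrt{k^2 \pm \mathrm{i}o}$. Here, according to the residue theorem $\oint f(z)\, (z - z_0)^{-1}\, \mathrm{d}z = 2\pi\mathrm{i}\, f(z_0)$, the residues in the upper half plane are important because there the integrals over the semi-circle with radius $|k'|$ vanish in the limit $|k'| \to \infty$. Then,

$$\int_{-\infty}^{\infty} k'\, \mathrm{d}k'\; \frac{\exp(\mathrm{i}k'a)}{k^2 \pm \mathrm{i}o - k'^2} = -2\pi\mathrm{i}\, \big( \pm\sqrt{k^2 \pm \mathrm{i}o} \big)\, \frac{\exp(\pm\mathrm{i}a\sqrt{k^2 \pm \mathrm{i}o})}{2\, \big( \pm\sqrt{k^2 \pm \mathrm{i}o} \big)} = -\pi\mathrm{i}\, \exp(\pm\mathrm{i}ka) ,$$

and therefore in the real-space representation, the resolvent becomes

$$\langle\mathbf{r}|\, G_0^{\pm}\, |\mathbf{r}'\rangle = -\frac{1}{4\pi}\, \frac{2m_0}{\hbar^2}\, \frac{\exp(\pm\mathrm{i}k|\mathbf{r} - \mathbf{r}'|)}{|\mathbf{r} - \mathbf{r}'|} .$$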


It is no accident that we encountered the functions exp(+ik|r — r’|)/|r — r'| in 
our discussion of electrodynamics (see p. 255), since we were discussing there the 
scattering of waves. 

Since the momentum representation for scattering problems is actually better 
than the real-space representation (given that the momenta mark the initial and final 
states, and the free propagators are diagonal in the momentum representation), we 
now derive the matrix elements of some popular interactions in the momentum rep-
resentation. Here we restrict ourselves to couplings which do not act on the spin, and
hence only involve Wigner forces, and we shall in fact focus on local and isotropic
couplings. Then with $\hbar\mathbf{q}$ as the momentum transfer, we have


Table 5.1 Scattering potentials and their Fourier transforms

Potential    V(r)/V₀               Ṽ(q) · (√(2π)/a)³ / V₀
Yukawa       (a/r) exp(−r/a)       4π / (1 + a²q²)
Coulomb      a/r                   4π / (a²q²)
Box          ε(a − r)              4π / (a²q²) · F₁(aq)
Gauss        exp(−r²/a²)           √π³ exp(−a²q²/4)


Fig. 5.4 Fourier transforms of several isotropic potentials. Top: Yukawa and Coulomb potentials. Bottom: Gauss and box potentials. Plotted are V(r)/V₀ against r/a and Ṽ(q) · (√(2π)/a)³/V₀ against aq. With q = 2k sin ½θ, Ṽ(q) can be used in the Born approximation T(q) ≈ Ṽ(q) for the differential scattering cross-section, as will be shown


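$$\langle\mathbf{k} + \mathbf{q}|\, V\, |\mathbf{k}\rangle = \int \mathrm{d}^3r\; \langle\mathbf{k} + \mathbf{q}|\mathbf{r}\rangle\, V(\mathbf{r})\, \langle\mathbf{r}|\mathbf{k}\rangle = \frac{1}{(2\pi)^3} \int \mathrm{d}^3r\; V(\mathbf{r})\, \exp(-\mathrm{i}\mathbf{q}\cdot\mathbf{r}) .$$

This is the Fourier transformed $\tilde V(\mathbf{q})$ of the coupling, disregarding the factor $(2\pi)^{-3/2}$. As long as $V(\mathbf{r})$ depends only upon $r$, as in the present case, we can easily integrate over the directions:

$$\langle\mathbf{k} + \mathbf{q}|\, V\, |\mathbf{k}\rangle = \frac{\tilde V(q)}{(2\pi)^{3/2}} = \frac{4\pi}{(2\pi)^3} \int_0^{\infty} \mathrm{d}r\; r^2\, V(r)\, \frac{\sin qr}{qr} .$$

Consequently, this matrix element only depends on the modulus of the momentum transfer: $\tilde V(\mathbf{q}) = \tilde V(q)$ for each (isotropic) Wigner force. Here $\mathbf{q} = \mathbf{k}_f - \mathbf{k}_i$, and consequently $q^2 = k_f^2 + k_i^2 - 2\mathbf{k}_f\cdot\mathbf{k}_i$, so for elastic scattering $q = 2k \sin\tfrac12\theta$, where $\theta$ is the scattering angle in the center-of-mass system.

Important examples with two parameters $V_0$ and $a$ for strength and distance are shown in Table 5.1 and Fig. 5.4, where the spherical Bessel function is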


$$F_1(\rho) = \rho^{-1} \sin\rho - \cos\rho .$$

Note that the Coulomb potential turns up as the limit $a \to \infty$ of the Yukawa potential, but with $aV_0$ held fixed. We can thus take

$$\int \mathrm{d}^3k\; k^{-2} \exp(-\mathrm{i}\mathbf{k}\cdot\mathbf{r}) = 4\pi \int_0^{\infty} \mathrm{d}k\; (kr)^{-1} \sin(kr) ,$$

because according to Sect. 1.1.10 this is equal to $4\pi\, r^{-1}\, \pi\, \{\varepsilon(r) - \tfrac12\}$, i.e., with $r > 0$, it is equal to $2\pi^2 r^{-1}$. Then $k^{-2}$ is the Fourier transform of $\sqrt{\pi/2}\; r^{-1}$. For the Gauss potential, we can use p. 23.
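The entries of Table 5.1 can be cross-checked against the radial integral $4\pi \int_0^\infty \mathrm{d}r\, r^2 V(r) \sin(qr)/(qr)$ by simple quadrature. In this sketch the Simpson rule, the cut-off radii, and the sample values of $a$ and $q$ are assumptions made for illustration; the Coulomb row is left out because its integral only converges as the limit of the Yukawa case.

```python
import math

def simpson(func, lower, upper, n):
    """Composite Simpson rule with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (upper - lower) / n
    total = func(lower) + func(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lower + i * h)
    return total * h / 3.0

def scaled_transform(r_times_v, q, a, r_max, n=20000):
    """(1/a^3) * integral d^3r (V(r)/V0) exp(-i q.r)
       = 4*pi/(q a^3) * integral_0^r_max dr [r V(r)/V0] sin(q r).
    Passing r*V(r)/V0 avoids the 1/r singularity of the Yukawa form at r = 0."""
    return 4.0 * math.pi / (q * a ** 3) * simpson(
        lambda r: r_times_v(r) * math.sin(q * r), 0.0, r_max, n)

a, q = 1.3, 0.7   # hypothetical range parameter and momentum transfer

numeric = {
    "Yukawa": scaled_transform(lambda r: a * math.exp(-r / a), q, a, 60.0 * a),
    "Box": scaled_transform(lambda r: r, q, a, a),   # V = V0 for r < a only
    "Gauss": scaled_transform(lambda r: r * math.exp(-(r / a) ** 2), q, a, 10.0 * a),
}

F1 = math.sin(a * q) / (a * q) - math.cos(a * q)
table = {
    "Yukawa": 4.0 * math.pi / (1.0 + (a * q) ** 2),
    "Box": 4.0 * math.pi / (a * q) ** 2 * F1,
    "Gauss": math.pi ** 1.5 * math.exp(-(a * q) ** 2 / 4.0),
}

for name in numeric:
    print(name, numeric[name], table[name])
```

The common factor $V_0$ and the normalization $(\sqrt{2\pi}/a)^3$ are divided out exactly as in the table header, so the printed pairs should agree to quadrature accuracy.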


5.1.7 Lippmann—Schwinger Equations 


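On p. 407, we derived the Lippmann-Schwinger equations for the propagators $G^{\pm} = G_0^{\pm} + G_0^{\pm} V G^{\pm} = G_0^{\pm} + G^{\pm} V G_0^{\pm}$. In the following, we shall generally skip the reference to $E$. Then,

$$G^{\pm} = G_0^{\pm}\, (1 + V G^{\pm}) = (1 + G^{\pm} V)\, G_0^{\pm} ,$$

and also $G_0^{\pm} = G^{\pm}\, (1 - V G_0^{\pm}) = (1 - G_0^{\pm} V)\, G^{\pm}$. This leads to

$$G_0^{\pm} = G_0^{\pm}\, (1 + V G^{\pm})\, (1 - V G_0^{\pm}) = (1 - G_0^{\pm} V)\, (1 + G^{\pm} V)\, G_0^{\pm} ,$$

$$G^{\pm} = G^{\pm}\, (1 - V G_0^{\pm})\, (1 + V G^{\pm}) = (1 + G^{\pm} V)\, (1 - G_0^{\pm} V)\, G^{\pm} .$$

Here $G_0^{\pm}$ acts in the Hilbert space of all states of the unperturbed problem, but $G^{\pm}$ only in the space of the scattering states: the bound states are missing. Therefore, the projection operator onto the scattering states of $H$ is now useful. Following Feshbach [1], we shall denote this by $P$. Then it follows that

$$(1 + V G^{\pm})\, (1 - V G_0^{\pm}) = (1 - G_0^{\pm} V)\, (1 + G^{\pm} V) = 1 ,$$

$$(1 - V G_0^{\pm})\, (1 + V G^{\pm}) = (1 + G^{\pm} V)\, (1 - G_0^{\pm} V) = P .$$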


We shall return to the fact that the bound states are missing in the next section. 

Before that, however, we shall also derive the Lippmann—Schwinger equations 
for the states. They are superior to the Schrödinger equation for scattering prob- 
lems, since for a differential equation, we still need boundary conditions in order to 
determine the solution uniquely. 

We denote the free states in the following by $|\psi\rangle$, but the scattering states by $|\psi\rangle^{+}$ or $|\psi\rangle^{-}$ (see Fig. 5.5). We take two different ones. In particular, we shall mark the "retarded" solution $|\psi\rangle^{+}$ of $H$ with the initial momentum (this is not a good quantum number because it is not conserved) and the "advanced" solution $|\psi\rangle^{-}$ with the final momentum. Now $t_0$ should mean the beginning of the scattering process for $|\psi(t)\rangle^{+}$ and the end for $|\psi(t)\rangle^{-}$. This leads to

$$|\psi(t)\rangle^{\pm} = U(t, t_0)\, |\psi(t_0)\rangle^{\pm} = \pm\mathrm{i}\hbar\, G^{\pm}(t - t_0)\, |\psi(t_0)\rangle^{\pm}$$

Fig. 5.5 The scattering states |p⟩⁺ (momentum upwards) with an attractive Coulomb potential, represented by the classical orbital curves (calculated according to Sect. 2.1.6). From orbit to orbit, the collision parameter changes each by one unit. Quantum-mechanically, sharp orbits are not allowed, which is to be noted particularly for the straight orbit through the center


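and in both cases $|\psi(t_0)\rangle^{\pm} = |\psi(t_0)\rangle$. In addition, instead of $\pm\mathrm{i}\hbar\, G_0^{\pm}(t - t_0)\, |\psi(t_0)\rangle$, we may also use $|\psi(t)\rangle$. With

$$G^{\pm}(t - t_0) = G_0^{\pm}(t - t_0) + \int \mathrm{d}t'\; G^{\pm}(t - t')\, V\, G_0^{\pm}(t' - t_0) ,$$

according to p. 406, this leads to the equation

$$|\psi(t)\rangle^{\pm} = |\psi(t)\rangle + \int \mathrm{d}t'\; G^{\pm}(t - t')\, V\, |\psi(t')\rangle .$$

Once again, the convolution integral can be transformed into a product via a Fourier transform (in the following, we shall again skip the reference to the energy representation):

$$|\psi\rangle^{\pm} = (1 + G^{\pm} V)\, |\psi\rangle .$$

With this the Lippmann-Schwinger equation holds, so $(1 - G_0^{\pm} V)\, |\psi\rangle^{\pm} = |\psi\rangle$, and hence,

$$|\psi\rangle^{\pm} = |\psi\rangle + G_0^{\pm}\, V\, |\psi\rangle^{\pm} .$$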


If we use the Born approximation for $G^{\pm}$ or for $|\psi\rangle^{\pm}$,

$$|\psi\rangle^{\pm} \approx |\psi\rangle + G_0^{\pm}\, V\, |\psi\rangle ,$$

then there are only known quantities on the right.


5.1.8 Möller’s Wave Operators 


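According to the last section, the scattering states $|\psi\rangle^{\pm}$ are related to the free states $|\psi\rangle$ via operators:

$$|\psi\rangle^{\pm} = (1 + G^{\pm} V)\, |\psi\rangle .$$

These are Möller's wave operators $\Omega^{\pm}$, with the property

$$\Omega^{\pm}\, |\psi\rangle = |\psi\rangle^{\pm} \quad\Longleftrightarrow\quad \langle\psi|\, \Omega^{\pm\dagger} = {}^{\pm}\langle\psi| .$$

Here, in fact, the set $\{|\psi\rangle\}$ forms a complete basis, but the set $\{|\psi\rangle^{+}\}$ or $\{|\psi\rangle^{-}\}$ comprises only the scattering states for $H$. The bound states are missing. If, following Feshbach as before, we introduce the projection operator $P$ onto the scattering states and the projection operator $Q = 1 - P$ onto the bound states, then

$$\Omega^{\pm\dagger}\, \Omega^{\pm} = 1 , \quad\text{but}\quad \Omega^{\pm}\, \Omega^{\pm\dagger} = 1 - Q = P .$$

The wave operators are not unitary, but only isometric, i.e., they conserve the norm. The wave operators $\Omega^{\pm}$ do not map onto the whole space, and the adjoints $\Omega^{\pm\dagger}$ map from a part of the space onto the whole space. Therefore, in

$$\Omega^{\pm} = P\, (1 + G^{\pm} V) ,$$

we should not forget the projection operator $P$. In any case, in

$$\Omega^{\pm}\, (1 - G_0^{\pm} V) = P ,$$

we must not put 1 on the right, because $\Omega^{\pm}$ does not lead to bound states. On the other hand, with $(1 - G_0^{\pm} V)\, G^{\pm} = G_0^{\pm}$ and $G^{\pm\dagger} = G^{\mp}$, we have

$$\Omega^{\pm}\, G_0^{\pm} = P\, G^{\pm} \quad\Longleftrightarrow\quad G_0^{\mp}\, \Omega^{\pm\dagger} = G^{\mp}\, P ,$$

and with $\Omega^{\pm} = P\, (1 + G^{\pm} V)$, the Lippmann-Schwinger equation

$$\Omega^{\pm} = P + \Omega^{\pm}\, G_0^{\pm}\, V$$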


for the wave operators. For the adjoint operators, we then obtain the equations

$$\Omega^{\pm\dagger} = (1 + V G^{\mp})\, P = P + V\, G_0^{\mp}\, \Omega^{\pm\dagger} ,$$

or $(1 - V G_0^{\mp})\, \Omega^{\pm\dagger} = P$. While $\Omega^{\pm}$ maps the free states to the scattering states of the full system, conversely, $\Omega^{\pm\dagger}$ maps the scattering states to the unperturbed system, and the bound states to the zero vector $|o\rangle$.

Incidentally, we also have

$$H\, \Omega^{\pm} = \Omega^{\pm}\, H_0 ,$$

since for all eigenstates of the energy, we have $H\, \Omega^{\pm} |\psi\rangle = H\, |\psi\rangle^{\pm} = E\, |\psi\rangle^{\pm}$, and the quantum number $E$ commutes with the wave operators $\Omega^{\pm}$, so

$$E\, \Omega^{\pm}\, |\psi\rangle = \Omega^{\pm}\, E\, |\psi\rangle = \Omega^{\pm}\, H_0\, |\psi\rangle .$$


5.1.9 Scattering and Transition Operators 


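We shall now look for the transition probability from the initial state $|\psi_i\rangle^{+}$ to the final state $|\psi_f\rangle^{-}$, or more precisely, the amplitude ${}^{-}\langle\psi_f|\psi_i\rangle^{+} = \langle\psi_f|\, \Omega^{-\dagger}\, \Omega^{+}\, |\psi_i\rangle$. Note that this does not depend upon time, because $|\psi\rangle^{+}$ and $|\psi\rangle^{-}$ relate to the same Hamilton operator $H$. The free states form a complete basis. Therefore, we follow Heisenberg and introduce the scattering operator

$$S = \Omega^{-\dagger}\, \Omega^{+} ,$$

which relates the initial state directly with the final state:

$$\langle\psi_f|\, S\, |\psi_i\rangle = {}^{-}\langle\psi_f|\psi_i\rangle^{+} .$$

If we know its matrix elements, then the scattering problem is essentially solved.

It remains to show that the scattering operator is unitary, even though the wave operators $\Omega^{\pm}$ are only isometric. With $S^{\dagger} S = \Omega^{+\dagger}\, \Omega^{-}\, \Omega^{-\dagger}\, \Omega^{+}$ and $S S^{\dagger} = \Omega^{-\dagger}\, \Omega^{+}\, \Omega^{+\dagger}\, \Omega^{-}$, we therefore investigate $\Omega^{\pm\dagger}\, \Omega^{\mp}\, \Omega^{\mp\dagger}\, \Omega^{\pm} = \Omega^{\pm\dagger}\, P\, \Omega^{\pm}$. Since $\Omega^{\pm}$ maps only onto the space of scattering states, we have $P\, \Omega^{\pm} = \Omega^{\pm}$, and thus $\Omega^{\pm\dagger}\, \Omega^{\pm} = 1$ is left over. The scattering operator is thus unitary:

$$S^{\dagger} S = S S^{\dagger} = 1 .$$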


Unitarity guarantees, among other things, that nothing is lost in the scattering process, 
whence the norm of the original wave remains conserved. 

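In order to show the influence of the interaction $V$ as clearly as possible, we reformulate the transition amplitude. With

$$|\psi_i\rangle^{+} - |\psi_i\rangle^{-} = (G^{+} - G^{-})\, V\, |\psi_i\rangle = -2\pi\mathrm{i}\; \delta(E - H)\, V\, |\psi_i\rangle ,$$

$${}^{+}\langle\psi_f| - {}^{-}\langle\psi_f| = \langle\psi_f|\, V\, (G^{-} - G^{+}) = +2\pi\mathrm{i}\; \langle\psi_f|\, V\, \delta(E - H) ,$$

we have in particular,

$$\langle\psi_f|\, S\, |\psi_i\rangle = {}^{-}\langle\psi_f|\psi_i\rangle^{+} = {}^{-}\langle\psi_f|\psi_i\rangle^{-} - 2\pi\mathrm{i}\; \delta(E_f - E_i)\; {}^{-}\langle\psi_f|\, V\, |\psi_i\rangle = {}^{+}\langle\psi_f|\psi_i\rangle^{+} - 2\pi\mathrm{i}\; \delta(E_f - E_i)\; \langle\psi_f|\, V\, |\psi_i\rangle^{+} .$$

Given the isometry of the wave operators, we have ${}^{-}\langle\psi_f|\psi_i\rangle^{-} = \langle\psi_f|\psi_i\rangle = {}^{+}\langle\psi_f|\psi_i\rangle^{+}$. Furthermore, the delta function $\delta(E_f - E_i)$ can be extracted and this ensures conservation of the energy:

$$\langle\psi_f|\, S\, |\psi_i\rangle = \delta(E_f - E_i)\, \{ \langle\Omega_f|\Omega_i\rangle - 2\pi\mathrm{i}\, \langle\psi_f|\, T\, |\psi_i\rangle \} ,$$

where the transition operator is defined by

$$T = V\, \Omega^{+} = \Omega^{-\dagger}\, V .$$

Here the expressions are only to be evaluated "on the energy shell", i.e., for $E_f = E_i$. Then we have $G_0^{+}\, T = G_0^{+}\, \Omega^{-\dagger}\, V = G^{+} P\, V$. Since $G^{\pm}$ acts only in the $P$-space, we write for short

$$G_0^{+}\, T = G^{+}\, V \quad\text{and}\quad T\, G_0^{+} = V\, G^{+} .$$

Then for the retarded propagators,

$$G^{+} = G_0^{+} + G_0^{+}\, T\, G_0^{+}$$

from the Lippmann-Schwinger equations. Correspondingly, from $T = V\, \Omega^{+} = V P\, (1 + G^{+} V)$, we deduce the Low equation

$$T = V + V\, G^{+}\, V .$$

According to the above equations, the Lippmann-Schwinger equations are valid for the transition operator $T$:

$$T = V + V\, G_0^{+}\, T = V + T\, G_0^{+}\, V .$$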


These equations are particularly useful, because the transition operator T is directly 
connected to the scattering cross-section and indeed other experimental quantities 
(observables), as we shall see in the next section. 

In the Born approximation, we replace $T$ by $V$ and thereby avoid having to compute the resolvents. Then, however, $G^{+} V$ must not be too large, which is why the Born approximation fails for resonances. Note finally that, in the Lippmann-Schwinger equation for $T$, different energies can occur in bra and ket, whereas for two-body scattering, they do not contribute.


5.1.10 The Wave Function ⟨r|k⟩⁺ for Large Distances r


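We now consider the real-space representation of the scattering states $|\mathbf{k}\rangle^{+}$ in the relative coordinate $\mathbf{r}$ of the two scattering partners. The limit $r \to \infty$ will be important for the scattering cross-section, with which we shall occupy ourselves subsequently. Particularly convenient is the starting equation

$$|\mathbf{k}\rangle^{+} = (1 + G_0^{+}\, T)\, |\mathbf{k}\rangle ,$$

because we have already found the real-space representation of $G_0^{\pm}$ on p. 409:

$$\langle\mathbf{r}|\, G_0^{\pm}\, |\mathbf{r}'\rangle = -\frac{1}{4\pi}\, \frac{2m_0}{\hbar^2}\, \frac{\exp(\pm\mathrm{i}k|\mathbf{r} - \mathbf{r}'|)}{|\mathbf{r} - \mathbf{r}'|} .$$

For $r \gg r'$ and $|\mathbf{r} - \mathbf{r}'| \approx r \sqrt{1 - 2\mathbf{r}\cdot\mathbf{r}'/r^2} \approx r - \mathbf{r}\cdot\mathbf{r}'/r$ (see Fig. 3.30), and with the abbreviation

$$\mathbf{k}' = k\, \frac{\mathbf{r}}{r} ,$$

the last expression goes over into

$$\langle\mathbf{r}|\, G_0^{\pm}\, |\mathbf{r}'\rangle \approx -\frac{1}{4\pi}\, \frac{2m_0}{\hbar^2}\, \frac{\exp(\pm\mathrm{i}kr)}{r}\, \exp(\mp\mathrm{i}\mathbf{k}'\cdot\mathbf{r}') .$$

Here, $\exp(-\mathrm{i}\mathbf{k}'\cdot\mathbf{r}') = (2\pi)^{3/2}\, \langle\mathbf{k}'|\mathbf{r}'\rangle$. Therefore, we have (see p. 399)

$$\langle\mathbf{r}|\mathbf{k}\rangle^{+} \approx \langle\mathbf{r}|\mathbf{k}\rangle - \sqrt{\frac{\pi}{2}}\; \frac{2m_0}{\hbar^2}\, \langle\mathbf{k}'|\, T\, |\mathbf{k}\rangle\, \frac{\exp(\mathrm{i}kr)}{r} = \frac{1}{\sqrt{2\pi}^{\,3}} \Big( \exp(\mathrm{i}\mathbf{k}\cdot\mathbf{r}) + f(\theta)\, \frac{\exp(\mathrm{i}kr)}{r} \Big) ,$$

with scattering amplitude

$$f(\theta) = -(2\pi)^2\, \frac{m_0}{\hbar^2}\, \langle\mathbf{k}'|\, T\, |\mathbf{k}\rangle = -\frac{(2\pi)^2}{k}\, \langle E\Omega_f|\, T\, |E\Omega_i\rangle .$$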


For the second formulation here, note that $|\mathbf{k}\rangle = |E\Omega\rangle\, \hbar/\sqrt{m_0 k}$, which follows from $\langle\mathbf{k}|\mathbf{k}'\rangle = k^{-2}\, \delta(k - k')\, \delta(\Omega - \Omega')$ and $\delta(E - E') = 2m_0 \hbar^{-2}\, \delta(k^2 - k'^2)$ with $\delta(k^2 - k'^2) = (2k)^{-1}\, \delta(k - k')$ (see p. 20). Here we recognize the difference between the wave vector and the energy representations. We have already discussed the difference between the wave vector and the momentum representations on p. 319. Here $\Omega_i$ gives the direction before scattering and $\Omega_f$ the direction afterwards. If there is a Wigner force (no spin dependence), only the scattering angle $\theta$ between the two directions is important, because for rotational invariance the transition operator in the angular momentum representation is diagonal and does not depend upon the directional (magnetic) quantum number:

$$\langle\Omega_f|\, T\, |\Omega_i\rangle = \sum_{lm} \langle\Omega_f|lm\rangle\, T_l\, \langle lm|\Omega_i\rangle = \sum_{lm} Y^{(l)}_m(\Omega_f)\, T_l\, Y^{(l)*}_m(\Omega_i) = \sum_l \frac{2l+1}{4\pi}\, T_l\, P_l(\cos\theta) .$$

It follows that $f(\theta) = -(\pi/k) \sum_l (2l+1)\, T_l\, P_l(\cos\theta)$, as claimed on p. 402.


5.1.11 Scattering Cross-Section 


Scattering cross-sections are not the only observables in scattering processes. For 
particles with spin, polarizations (i.e., spin distributions) can be measured. But in 
that case, only the angular momentum algebra need be applied. The basic notions can 
be explained with the example of spinless particles, and we shall restrict ourselves 
here to this essentially simple case. 

The differential scattering cross-section $\mathrm{d}\sigma/\mathrm{d}\Omega$ is given by the number of particles scattered into the solid angle element $\mathrm{d}\Omega$ relative to the number of incoming particles per unit area and the number of scattering centers. (For stationary currents, we have to refer to equal time spans in the numerator and denominator. In addition, the expression does not hold if the incoming or outgoing particles interact with each other, or if the individual centers scatter coherently, as for the refraction of slow neutrons in crystals.) We can also express the scattering cross-section in terms of the current densities of the scattering wave and the incoming wave:

$$\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} = \frac{j_{\mathrm{scat}}(\Omega)\; r^2}{j_{\mathrm{in}}} .$$

Here it is well known that, in the real-space representation, we have (see p. 348)

$$\mathbf{j}(\mathbf{r}) = \frac{\hbar}{2\mathrm{i}\, m_0}\, \big( \psi^{*}\, \nabla\psi - \psi\, \nabla\psi^{*} \big) ,$$

and from $\psi_{\mathrm{scat}}(\mathbf{r}) \approx (2\pi)^{-3/2} \exp(\mathrm{i}kr)\, f(\theta)/r$ and $\psi(\mathbf{r}) = (2\pi)^{-3/2} \exp(\mathrm{i}\mathbf{k}\cdot\mathbf{r})$, we obtain the current densities

$$j_{\mathrm{in}} = \frac{1}{(2\pi)^3}\, \frac{\hbar k}{m_0} , \qquad j_{\mathrm{scat}} \approx \frac{1}{(2\pi)^3}\, \frac{\hbar k}{m_0}\, \frac{|f(\theta)|^2}{r^2} .$$


Therefore, the differential scattering cross-section can be evaluated from the scattering amplitude $f$ and the transition matrix $T$ as follows:

$$\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} = |f(\theta)|^2 = \frac{(2\pi)^4}{k^2}\, |\langle E\Omega_f|\, T\, |E\Omega_i\rangle|^2 ,$$

if we also use the last section for the relation between $f$ and $T$.
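As a worked instance of these formulas, one can replace $T$ by $V$ (Born approximation) for the Yukawa potential of Table 5.1. Combining $f(\theta) = -(2\pi)^2 (m_0/\hbar^2) \langle\mathbf{k}_f|T|\mathbf{k}_i\rangle$ with $\langle\mathbf{k}+\mathbf{q}|V|\mathbf{k}\rangle = (2\pi)^{-3}\, 4\pi V_0 a^3/(1 + a^2q^2)$ gives $f_{\mathrm{Born}}(\theta) = -(2m_0/\hbar^2)\, V_0 a^3/(1 + a^2q^2)$ with $q = 2k\sin\tfrac12\theta$. The sketch below evaluates this; the parameter values, the units $2m_0/\hbar^2 = 1$, and the function names are assumptions for illustration, and the result is only meaningful where the Born approximation itself applies.

```python
import math

def born_amplitude_yukawa(theta, k, V0, a, two_m0_over_hbar2=1.0):
    """Born scattering amplitude for V(r) = V0 (a/r) exp(-r/a).
    two_m0_over_hbar2 is the factor 2*m0/hbar^2 (set to 1 as a unit choice)."""
    q = 2.0 * k * math.sin(0.5 * theta)       # modulus of the momentum transfer
    return -two_m0_over_hbar2 * V0 * a ** 3 / (1.0 + (a * q) ** 2)

def differential_cross_section(theta, k, V0, a):
    """d sigma / d Omega = |f(theta)|^2."""
    return abs(born_amplitude_yukawa(theta, k, V0, a)) ** 2

def integrated_cross_section(k, V0, a, n=2000):
    """sigma = 2*pi * integral_0^pi sin(theta) |f|^2 d theta (midpoint rule)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += math.sin(theta) * differential_cross_section(theta, k, V0, a)
    return 2.0 * math.pi * total * h

k, V0, a = 1.2, 0.1, 1.0     # hypothetical wave number, strength, and range
sigma_numeric = integrated_cross_section(k, V0, a)
# Closed form of the same angular integral: 4*pi*A^2/(1 + 4 a^2 k^2), A = V0 a^3 here
sigma_closed = 4.0 * math.pi * (V0 * a ** 3) ** 2 / (1.0 + 4.0 * (a * k) ** 2)
print(sigma_numeric, sigma_closed)
```

The forward peak at $\theta = 0$ and its width of order $1/(ka)$ in $\sin\tfrac12\theta$ follow directly from the factor $1/(1 + a^2 q^2)$.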

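Using $\langle E\Omega|\, S\, |E'\Omega'\rangle = \langle E\Omega|E'\Omega'\rangle - 2\pi\mathrm{i}\, \langle E|E'\rangle\, \langle E\Omega|\, T\, |E\Omega'\rangle$ and the unitarity of the scattering operators, viz., $S^{\dagger} S = 1$, which expresses current conservation, we obtain

$$\mathrm{i}\, \langle E\Omega|\, T\, |E\Omega'\rangle - \mathrm{i}\, \langle E\Omega'|\, T\, |E\Omega\rangle^{*} = 2\pi \int \mathrm{d}\Omega''\; \langle E\Omega''|\, T\, |E\Omega\rangle^{*}\, \langle E\Omega''|\, T\, |E\Omega'\rangle ,$$

after splitting off the factor $2\pi\, \delta(E - E')$. With $\Omega' = \Omega$, this implies

$$-2\, \mathrm{Im}\, \langle E\Omega|\, T\, |E\Omega\rangle = 2\pi \int \mathrm{d}\Omega'\; |\langle E\Omega'|\, T\, |E\Omega\rangle|^2 = \frac{k^2}{(2\pi)^3} \int \mathrm{d}\Omega'\; \frac{\mathrm{d}\sigma}{\mathrm{d}\Omega'} ,$$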

and what is known as the optical theorem relating the integrated scattering cross-section and the forward scattering amplitude:

$$\sigma = \frac{(2\pi)^3}{k^2}\, \big( -2\, \mathrm{Im}\, \langle E\Omega|\, T\, |E\Omega\rangle \big) = \frac{4\pi}{k}\, \mathrm{Im}\, f(0) .$$

To first order in the Born approximation, the forward scattering amplitude is real, which contradicts unitarity. In fact, for the forward scattering amplitude, at least the second order is necessary.
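The optical theorem can be checked numerically with the partial-wave form $f(\theta) = -(\pi/k) \sum_l (2l+1)\, T_l\, P_l(\cos\theta)$. The sketch below assumes the usual unitary parametrization $T_l = -\pi^{-1} \mathrm{e}^{\mathrm{i}\delta_l} \sin\delta_l$ by real phase shifts $\delta_l$ (so that $1 - 2\pi\mathrm{i}\, T_l = \mathrm{e}^{2\mathrm{i}\delta_l}$ has modulus 1), which is not introduced in the text above; the phase-shift values and the wave number are arbitrary illustrative numbers.

```python
import cmath
import math

def legendre_P(lmax, x):
    """Return [P_0(x), ..., P_lmax(x)] from the three-term recursion."""
    P = [1.0, x]
    for l in range(1, lmax):
        P.append(((2 * l + 1) * x * P[l] - l * P[l - 1]) / (l + 1))
    return P[: lmax + 1]

def amplitude(cos_theta, k, T):
    """f = -(pi/k) * sum_l (2l+1) T_l P_l(cos theta)."""
    P = legendre_P(len(T) - 1, cos_theta)
    return -math.pi / k * sum((2 * l + 1) * T[l] * P[l] for l in range(len(T)))

def integrated_cross_section(k, T, n=2000):
    """sigma = 2*pi * integral_{-1}^{1} |f|^2 d(cos theta), Simpson rule (n even)."""
    h = 2.0 / n
    total = abs(amplitude(-1.0, k, T)) ** 2 + abs(amplitude(1.0, k, T)) ** 2
    for i in range(1, n):
        x = -1.0 + i * h
        total += (4 if i % 2 else 2) * abs(amplitude(x, k, T)) ** 2
    return 2.0 * math.pi * total * h / 3.0

k = 1.2
deltas = [0.8, 0.3, 0.05]   # hypothetical phase shifts for l = 0, 1, 2
T = [-cmath.exp(1j * d) * math.sin(d) / math.pi for d in deltas]

sigma = integrated_cross_section(k, T)
optical = 4.0 * math.pi / k * amplitude(1.0, k, T).imag   # forward direction: cos(theta) = 1
print(sigma, optical)
```

If the imaginary parts of the $T_l$ are dropped, as happens to first order in the Born approximation, the right-hand side vanishes while $\sigma$ stays positive, which illustrates the remark about unitarity.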

If there are other processes in addition to elastic scattering, such as inelastic or
even rearrangement reactions, then $\sigma$ in the last equation stands for the sum of all integrated
scattering cross-sections, the total scattering cross-section, because we have to insert
a complete basis in order to arrive at $|T|^2$ when computing $T^{\dagger} T$.


5.1.12 Summary: Scattering Theory 


In scattering theory, we investigate how an original state is transformed into a new state as a consequence of a perturbation $V$. In addition to the quantities associated with the unperturbed system, i.e., the Hamilton operator $H_0$, the time shift operator $U_0$, the propagators (Green functions) $G_0^{\pm}$, and the states $|\psi\rangle$, there are quantities associated with the (full) perturbed problem: the Hamilton operator $H = H_0 + V$, the time shift operator $U$, the propagators $G^{\pm}$, and the states $|\psi\rangle^{\pm}$. These quantities are related to each other, in the time-dependent case via integral equations, in the energy-dependent case via the Lippmann-Schwinger equations. The scattering operator $S$, or again the transition operator $T$, describes the transition from the unperturbed initial state to the unperturbed final state.


5.2 Two- and Three-Body Scattering Problems 


5.2.1 Two-Potential Formula of Gell-Mann and Goldberger 


This formula is important for many applications of the generalized scattering theory 
and starts from 


\[ V = \tilde V + \delta V , \]


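where the approximate scattering problem for $\tilde V$ is considered already solved, so that
the propagator for $H_0 + \tilde V$ is known, viz.,

\[ \tilde G = G_0\,(1 + \tilde V\tilde G) = (1 + \tilde G\tilde V)\,G_0 , \]
along with the transition operator $\tilde T$:

\[ \tilde T = \tilde V\,(1 + G_0\tilde T) = (1 + \tilde T G_0)\,\tilde V . \]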


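Note that, from now on, we shall usually skip the indices $\pm$ and the argument $\mathcal{E}$.
According to p. 415, we also have $G_0\tilde T = \tilde G\tilde V$ and $\tilde T G_0 = \tilde V\tilde G$. In addition, using
$G = G_0 + G_0\,(\tilde V + \delta V)\,G$, which implies $(1 - G_0\tilde V)\,G = G_0\,(1 + \delta V\,G)$, then
multiplying by $1 + \tilde G\tilde V$ and using the relation $(1 + \tilde G\tilde V)(1 - G_0\tilde V) = 1 = (1 - \tilde V G_0)(1 + \tilde V\tilde G)$
found on p. 411 (with $\delta V = 0$), we deduce that

\[ G = \tilde G\,(1 + \delta V\,G) = (1 + G\,\delta V)\,\tilde G , \]
where we just write $G$ instead of $PG$ or $GP$ once again, since we restrict ourselves to
these states anyway. Another proof of this equation follows using $G^{-1} = \mathcal{E} - H$,
$\tilde G^{-1} = \mathcal{E} - \tilde H$, and $\delta V = V - \tilde V = \tilde G^{-1} - G^{-1}$:
\[ \tilde G\,\delta V\,G = G - \tilde G = G\,\delta V\,\tilde G . \]
We thus have a Lippmann–Schwinger equation in which, instead of the full coupling,
only the “perturbation” $\delta V$ appears, but with $\tilde G$ instead of the free propagators $G_0$.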


According to the last equation, we have 


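\[ (1 + G\,\delta V)(1 + \tilde G\tilde V) = 1 + G\,\delta V + \tilde G\tilde V + (G - \tilde G)\,\tilde V = 1 + GV . \]

This factorization of $1 + GV$ is useful because then, from $|\psi\rangle^\pm = (1 + G^\pm V)\,|\psi\rangle$,
with the states deformed by $\tilde V$, $|\tilde\psi\rangle^\pm = (1 + \tilde G^\pm\tilde V)\,|\psi\rangle$, we have the helpful relation

\[ |\psi\rangle^\pm = (1 + G^\pm\,\delta V)\,|\tilde\psi\rangle^\pm . \]

Note that $1 + VG$ factorizes into $(1 + \tilde V\tilde G)(1 + \delta V\,G)$.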

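For the Low equation $T = V\,(1 + GV)$, we can also use this kind of factorization.
With

\[ V\,(1 + G\,\delta V) = \tilde V + (1 + VG)\,\delta V , \qquad (1 + VG)\,\delta V = (1 + \tilde V\tilde G)(1 + \delta V\,G)\,\delta V , \]
and the modified Low equation
\[ \delta T = (1 + \delta V\,G)\,\delta V , \]

along with $\tilde T = \tilde V\,(1 + \tilde G\tilde V)$, we obtain the formula of Gell-Mann and Goldberger

\[ T = \tilde T + (1 + \tilde V\tilde G)\,\delta T\,(1 + \tilde G\tilde V) = \tilde T + (1 + \tilde T G_0)\,\delta T\,(1 + G_0\tilde T) , \]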


which is extremely useful here. 
For the matrix elements of the transition operators, we thus have 


\[ \langle\psi_f|T|\psi_i\rangle = \langle\psi_f|\tilde T|\psi_i\rangle + {}^-\langle\tilde\psi_f|\,\delta T\,|\tilde\psi_i\rangle^+ . \]


If we take the Born approximation $\delta T \approx \delta V$ for $\delta T$ here, we obtain a better Born
approximation known as the distorted-wave Born approximation (DWBA). Whereas
all higher order terms in $V$ are left out in the Born approximation, now only those in
$\delta V$ are missing. However, the states $|\tilde\psi\rangle^\pm$ (distorted by $\tilde V$) still have to be calculated,
as does $\tilde T$.
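The operator identities of this section hold for any finite matrices, which allows a quick numerical check. The sketch below uses random Hermitian matrices as stand-ins for $H_0$, $\tilde V$, and $\delta V$, together with a complex energy so that every resolvent exists; the dimension, seed, and energy are arbitrary choices and not part of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                   # matrix dimension (arbitrary)
one = np.eye(n)

def random_hermitian(scale):
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return scale * (a + a.conj().T) / 2

H0 = random_hermitian(1.0)
V_tilde = random_hermitian(0.5)         # part of the coupling treated exactly
dV = random_hermitian(0.2)              # residual coupling delta V
V = V_tilde + dV
E = 0.3 + 0.5j                          # complex energy keeps the resolvents finite

G0 = np.linalg.inv(E * one - H0)
G_tilde = np.linalg.inv(E * one - H0 - V_tilde)
G = np.linalg.inv(E * one - H0 - V)

T = V @ (one + G @ V)                   # Low equation
T_tilde = V_tilde @ (one + G_tilde @ V_tilde)
dT = dV @ (one + G @ dV)                # modified Low equation

# two-potential formula of Gell-Mann and Goldberger
rhs = T_tilde + (one + T_tilde @ G0) @ dT @ (one + G0 @ T_tilde)
err_two_potential = np.max(np.abs(T - rhs))

# G - G~ = G~ dV G, the Lippmann-Schwinger equation with G~ instead of G0
err_resolvent = np.max(np.abs(G - G_tilde - G_tilde @ dV @ G))
print(err_two_potential, err_resolvent)
```

Both residuals vanish to rounding accuracy. Replacing `dT` by `dV` in `rhs` gives the matrix analogue of the distorted-wave Born approximation, whose error is of second order in `dV`.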

Note that we also have 


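\[ (1 + G\,\delta V)(1 - \tilde G\,\delta V) = 1 = (1 - \tilde G\,\delta V)(1 + G\,\delta V) , \]

since the first product is equal to $1 + (G - \tilde G - G\,\delta V\,\tilde G)\,\delta V$, and we have already
proven $G - \tilde G = G\,\delta V\,\tilde G$. Consequently, multiplying $|\psi\rangle^\pm = (1 + G^\pm\,\delta V)\,|\tilde\psi\rangle^\pm$ by
$1 - \tilde G\,\delta V$, we obtain $|\tilde\psi\rangle^\pm = (1 - \tilde G^\pm\,\delta V)\,|\psi\rangle^\pm$, or the Lippmann–Schwinger equation

\[ |\psi\rangle^\pm = |\tilde\psi\rangle^\pm + \tilde G^\pm\,\delta V\,|\psi\rangle^\pm . \]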


We shall refer to this in Sect. 5.2.4. 




5.2.2 Scattering Phases 


This result will now be explained using the methods mentioned in Sect. 5.1.1. There 
we introduced the spherical Bessel functions 


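\[ F_l \approx \sin(\rho - \tfrac12 l\pi) , \qquad O_l \approx \exp\{+i(\rho - \tfrac12 l\pi)\} , \]
\[ G_l \approx \cos(\rho - \tfrac12 l\pi) , \qquad I_l \approx \exp\{-i(\rho - \tfrac12 l\pi)\} , \]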


and expanded the radial function of the Schrödinger equation with respect to two of
them in the region with $V = 0$. If $V$ vanishes everywhere (and hence the transition
operator along with it), then the function $F_l$ alone suffices, because only this is
differentiable at the origin. Generally, $u_l \approx N_l\,(F_l - \pi T_l\,O_l)$, where $N_l$ ensures the
correct normalization. Given the unitarity of the scattering operators, we set

\[ S_l = \exp(2i\delta_l) , \]

and make use of $S_l = 1 - 2\pi i\,T_l$. Then,

\[ -\pi T_l = \frac{\exp(2i\delta_l) - 1}{2i} = \exp(i\delta_l)\,\sin\delta_l , \]
and with $F_l = (O_l - I_l)/(2i)$, it follows that
\[ 2i\,u_l/N_l \approx O_l - I_l + \{\exp(2i\delta_l) - 1\}\,O_l = \exp(2i\delta_l)\,O_l - I_l , \]
so
\[ u_l \approx N_l\,\exp(i\delta_l)\,\sin(\rho - \tfrac12 l\pi + \delta_l) . \]


In order to fix the scattering phase $\delta_l$ not only mod $\pi$, we also require it to depend
continuously on $k$ and vanish for $k \to \infty$, because for $E \to \infty$, the coupling $V$
should be negligible. To the (repulsive) centrifugal force there clearly corresponds
the (negative) scattering phase $-\tfrac12 l\pi$, independent of the energy. Note that, on the
other hand, according to the Levinson theorem, the phase shift for $k = 0$ is equal to
$\pi$ times the number of bound states.

After these preliminaries, we introduce the scattering phase $\tilde\delta_l$ associated with $\tilde V$,
and in addition to $\tilde O_l = \exp(i\tilde\delta_l)\,O_l$, we use

\[ \tilde F_l \approx \cos\tilde\delta_l\,F_l + \sin\tilde\delta_l\,G_l = \cos\tilde\delta_l\,F_l + \sin\tilde\delta_l\,(O_l - iF_l) = \exp(-i\tilde\delta_l)\,F_l + \sin\tilde\delta_l\,O_l . \]


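With this we obtain for $u_l$ asymptotically the expression $N_l\,\{\tilde F_l - \pi\,\delta T_l\,\tilde O_l\}$, and instead
of the curly brackets, we may also write

\[ \exp(-i\tilde\delta_l)\,\{F_l + \sin\tilde\delta_l\,\exp(i\tilde\delta_l)\,O_l - \pi\,\delta T_l\,\exp(2i\tilde\delta_l)\,O_l\} \]
\[ = \exp(-i\tilde\delta_l)\,\{F_l + O_l\,[\exp(i\tilde\delta_l)\sin\tilde\delta_l - \pi\,\delta T_l\,\exp(2i\tilde\delta_l)]\} . \]

Since we now have to set $\exp(i\tilde\delta_l)\,\sin\tilde\delta_l = -\pi\tilde T_l$, we obtain
\[ u_l \approx N_l\,\exp(-i\tilde\delta_l)\,\{F_l - \pi\,(\tilde T_l + \exp(2i\tilde\delta_l)\,\delta T_l)\,O_l\} . \]
From this we can conclude that we should take
\[ T_l = \tilde T_l + \exp(2i\tilde\delta_l)\,\delta T_l , \]
which corresponds to the two-potential formula. Here, the factor $\exp(2i\tilde\delta_l)$ originates
from the distortion of the states due to the coupling $\tilde V$, because we have used the
functions $\tilde F_l$ and $\tilde O_l$.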
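The partial-wave form of the two-potential formula is easy to verify with numbers. The sketch below assumes that the total phase is the sum of the phase due to the approximate coupling and an additional phase due to the residual coupling, and that each transition amplitude is obtained from its phase via $-\pi T_l = \exp(i\delta_l)\sin\delta_l$; the two phase values are hypothetical.

```python
import cmath
import math

def t_partial(delta):
    """Partial-wave transition amplitude from -pi*T_l = exp(i*delta)*sin(delta)."""
    return -cmath.exp(1j * delta) * math.sin(delta) / math.pi

delta_tilde = 0.7     # phase caused by the approximate coupling (hypothetical value)
delta_extra = -0.25   # additional phase caused by the residual coupling (hypothetical)

lhs = t_partial(delta_tilde + delta_extra)
rhs = t_partial(delta_tilde) + cmath.exp(2j * delta_tilde) * t_partial(delta_extra)
print(abs(lhs - rhs))   # vanishes up to rounding
```

The identity behind this is $\exp(2i\delta) - 1 = (\exp(2i\tilde\delta) - 1) + \exp(2i\tilde\delta)\,(\exp(2i(\delta - \tilde\delta)) - 1)$, so the phases add while the amplitudes combine with the distortion factor.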


5.2.3 Scattering of Charged Particles 


An important application is to scattering by the Coulomb potential, since it decreases 
so slowly with increasing r that the previous results cannot simply be carried over. 
Here we use the Sommerfeld parameter (Coulomb parameter) 


\[ \eta = \frac{zZe^2}{4\pi\varepsilon_0}\,\frac{m_0}{\hbar^2 k} , \]


together with the Coulomb scattering phase 


\[ \sigma_l(\eta) = \arg\Gamma(l + 1 + i\eta) \quad\Longrightarrow\quad \exp(2i\sigma_l) = \frac{\Gamma(l + 1 + i\eta)}{\Gamma(l + 1 - i\eta)} . \]
The spherical Bessel functions are now replaced by the Coulomb wave functions 


\[ F_l(\eta, \rho) \approx \sin(\rho - \eta\ln 2\rho - \tfrac12 l\pi + \sigma_l) , \]
\[ O_l(\eta, \rho) \approx \exp\{+i(\rho - \eta\ln 2\rho - \tfrac12 l\pi + \sigma_l)\} , \]


where the logarithm originates from the long range of the potential in the radial 
Schrödinger equation 


\[ \Bigl(\frac{d^2}{d\rho^2} - \frac{l(l+1)}{\rho^2} - \frac{2\eta}{\rho} + 1\Bigr)\,u_l(\rho) = 0 , \quad\text{with } \rho = kr . \]


Note that for the bound states, $-1$ appears instead of $+1$ for the energy and the
principal quantum number $n$ instead of $-\eta$ (see p. 362). Despite the long range, we
can introduce a Coulomb scattering amplitude

\[ f_C(\theta) = -\frac{\eta\,\exp\{2i\,(\sigma_0 - \eta\ln\sin\tfrac12\theta)\}}{2k\,\sin^2\tfrac12\theta} , \]

and hence determine the Rutherford cross-section

\[ \frac{d\sigma}{d\Omega} = |f_C(\theta)|^2 = \frac{\eta^2}{4k^2\,\sin^4(\tfrac12\theta)} . \]


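With $f_C(\theta) = -(2\pi)^2 k^{-1}\,\langle\Omega_f|T_C|\Omega_i\rangle$, the matrix element of the transition operator
for the Coulomb problem follows from the scattering amplitude:

\[ T_C(\theta) = \frac{\tfrac12\eta\,\exp\{2i\,(\sigma_0 - \eta\ln\sin\tfrac12\theta)\}}{(2\pi\sin\tfrac12\theta)^2} . \]

Incidentally, its modulus $\tfrac12\eta\,(2\pi\sin\tfrac12\theta)^{-2}$ is equal to $\langle E\Omega_f|V_C|E\Omega_i\rangle$, because with

\[ V_C(r) = \frac{zZe^2}{4\pi\varepsilon_0\,r} = \frac{\eta\,\hbar^2 k}{m_0\,r} , \]

we have
\[ \langle k_f|V_C|k_i\rangle = \tfrac12\,\eta\,\hbar^2 k\,m_0^{-1}\,(2\pi k\sin\tfrac12\theta)^{-2} , \]

according to p. 410, and in addition,
\[ \langle E\Omega_f|V_C|E\Omega_i\rangle = m_0 k\hbar^{-2}\,\langle k_f|V_C|k_i\rangle , \]

according to p. 417. Only the phase is missing from the Born approximation!

We thus have $\tilde F_l(\rho) = F_l(\eta, \rho)$ and $\tilde O_l(\rho) = O_l(\eta, \rho)$ for the scattering of charged
particles, along with $\tilde T = T_C(\theta)$ and $\tilde\delta_l = \sigma_l(\eta)$. Further forces (e.g., nuclear forces)
then contribute in the term $\delta T_l$.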
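The relations between the Coulomb amplitude, the Rutherford cross-section, and the Born matrix element can be illustrated numerically. In the sketch below the function names are hypothetical, the Coulomb phase $\sigma_0$ is passed in as a plain parameter rather than computed from the gamma function, and the values of $k$ and $\eta$ are arbitrary.

```python
import numpy as np

def coulomb_amplitude(theta, k, eta, sigma0=0.0):
    """f_C(theta); sigma0 = arg Gamma(1 + i eta) is supplied by the caller."""
    s = np.sin(theta / 2)
    return -eta / (2 * k * s**2) * np.exp(2j * (sigma0 - eta * np.log(s)))

def rutherford(theta, k, eta):
    """Rutherford cross-section eta^2 / (4 k^2 sin^4(theta/2))."""
    return eta**2 / (4 * k**2 * np.sin(theta / 2) ** 4)

def coulomb_t_matrix(theta, k, eta, sigma0=0.0):
    """T_C(theta) from f_C = -(2 pi)^2 T_C / k."""
    return -k * coulomb_amplitude(theta, k, eta, sigma0) / (2 * np.pi) ** 2

theta = np.linspace(0.2, np.pi, 50)      # avoid the forward singularity
k, eta = 1.7, 0.9                        # hypothetical values
f = coulomb_amplitude(theta, k, eta, sigma0=0.4)
born_modulus = 0.5 * eta / (2 * np.pi * np.sin(theta / 2)) ** 2
err_rutherford = np.max(np.abs(np.abs(f) ** 2 - rutherford(theta, k, eta)))
err_born = np.max(np.abs(np.abs(coulomb_t_matrix(theta, k, eta, 0.4)) - born_modulus))
print(err_rutherford, err_born)
```

Neither check depends on the value chosen for `sigma0`: the Coulomb phase and the logarithmic term drop out of the modulus, which is why the Born approximation already reproduces the Rutherford cross-section.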


5.2.4 Effective Hamilton Operator in the Feshbach Theory 


A further important application of the two-potential formula is the unified theory of 
nuclear reactions due to Feshbach (see p. 411). This leads us to a deeper understand- 
ing of all resonances and direct reactions (not only in nuclear physics) and embraces 
several other resonance models. 

The decisive point of the Feshbach formalism is the separation of the Hilbert 
space into two parts, on which we project with the operators P and Q: 


\[ P = P^\dagger = P^2 , \quad Q = Q^\dagger = Q^2 , \quad PQ = 0 = QP , \quad P + Q = 1 . \]


$P$ maps onto those states which do not vanish for large $r$, viz., the scattering states
describing open channels, and $Q$ onto the “bound” states, which vanish for large
$r$ and describe closed channels. This division considers only large distances of the
scattering partners (asymptotic boundary conditions) and allows several cases for
short distances. Therefore, different resonance theories are still possible. If we introduce,
e.g., a channel radius $R$ with the property that the interaction vanishes for
larger distances, we may let $Q$ project onto the space $0 \le r \le R$ and $P$ onto the space
$r > R$: this leads to the scattering matrix of Wigner and Eisenbud [2] (see also [3]).
(It differs from the transition matrix of Kapur and Peierls [4] in that the boundary
conditions for $r = R$ depend upon the energy.) In the Feshbach formalism, there is
no need for the channel radius $R$.

Along with the division of the Hilbert space into open and closed channels, we 
also have to decompose the Hamilton operator correspondingly: 


\[ H = (P + Q)\,H\,(P + Q) = H_{PP} + H_{PQ} + H_{QP} + H_{QQ} . \]

For the scattering cross-section, only $P|\psi\rangle^+$ is important. We now search for the
“effective” Hamilton operator acting on these scattering states, and after that derive
the associated Lippmann–Schwinger equation.

To begin with, from $(E - H)\,|\psi\rangle^+ = 0$, after projection with $P$ and $Q$ such that
$1 = P^2 + Q^2$, we have the general result

\[ (E - H_{PP})\,P|\psi\rangle^+ = H_{PQ}\,Q|\psi\rangle^+ \quad\text{and}\quad (E - H_{QQ})\,Q|\psi\rangle^+ = H_{QP}\,P|\psi\rangle^+ . \]

Since $Q$ projects onto the closed channels, the inhomogeneous term is missing in its
Lippmann–Schwinger equation:

\[ Q|\psi\rangle^+ = G_Q\,H_{QP}\,P|\psi\rangle^+ , \quad\text{with } G_Q = \frac{1}{E - H_{QQ}} . \]

If we insert this into the other relation, we obtain the homogeneous equation

\[ (E - H_{PP} - H_{PQ}\,G_Q\,H_{QP})\,P|\psi\rangle^+ = 0 . \]

We thus find the effective Hamilton operator $H_{PP} + H_{PQ}\,G_Q\,H_{QP}$. Clearly, it can be
used for the two-potential formula: $H_{PP}$ plays the role of $H_0 + \tilde V$ and $H_{PQ}\,G_Q\,H_{QP}$
that of $\delta V$. However, from now on, we write $G_P^\pm = (\mathcal{E} - H_{PP})^{-1}$ with complex $\mathcal{E}$,
instead of $\tilde G^\pm$, and according to p. 420, we now have

\[ P|\psi\rangle^+ = |\tilde\psi\rangle^+ + G_P^+\,H_{PQ}\,G_Q\,H_{QP}\,P|\psi\rangle^+ , \quad\text{with } G_P^+ = \frac{1}{E + io - H_{PP}} , \]


as the Lippmann-Schwinger equation for the unknown scattering state. 
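For a finite matrix, the elimination of the Q-space can be checked directly. In the following sketch a random real symmetric matrix stands in for the Hamilton operator, and P and Q simply select the first and the remaining basis vectors; these choices, as well as the dimensions and the energy, are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_p, n_q = 4, 3                       # dimensions of P- and Q-space (arbitrary)
a = rng.normal(size=(n_p + n_q, n_p + n_q))
H = (a + a.T) / 2                     # random real symmetric stand-in for H
P, Q = slice(0, n_p), slice(n_p, n_p + n_q)

def h_eff(E):
    """Effective operator H_PP + H_PQ (E - H_QQ)^(-1) H_QP at energy E."""
    G_Q = np.linalg.inv(E * np.eye(n_q) - H[Q, Q])
    return H[P, P] + H[P, Q] @ G_Q @ H[Q, P]

# (1) the P-block of the full resolvent equals the resolvent of H_eff
E = 0.4 + 0.3j
G_PP = np.linalg.inv(E * np.eye(n_p + n_q) - H)[P, P]
res_resolvent = np.max(np.abs(G_PP - np.linalg.inv(E * np.eye(n_p) - h_eff(E))))

# (2) for an exact eigenstate of H, (E - H_eff(E)) P|psi> vanishes
E_n, vecs = np.linalg.eigh(H)
psi = vecs[:, 0]
res_eigen = np.linalg.norm((E_n[0] * np.eye(n_p) - h_eff(E_n[0])) @ psi[P])
print(res_resolvent, res_eigen)
```

Note that `h_eff` depends on the energy at which it is evaluated, so the homogeneous equation is not an ordinary eigenvalue problem in P-space.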




5.2.5 Separable Interactions and Resonances 


The key feature of the new residual interaction $\delta V = H_{PQ}\,G_Q\,H_{QP}$ is the product
form. Such couplings are said to be separable. They can be diagonalized in the
space of scattering states and therefore not in real space, and are thus non-local:
$\langle\mathbf{r}|V|\mathbf{r}'\rangle \ne V_0(\mathbf{r})\,\delta(\mathbf{r} - \mathbf{r}')$. The transition operator $\delta T$ now also factorizes, because
the relations $\delta T = \delta V\,(1 + G\,\delta V)$ and $1 + G\,\delta V = (1 - \tilde G\,\delta V)^{-1}$ mentioned on
p. 420 deliver $\delta T = \delta V/(1 - \tilde G\,\delta V)$:

\[ \delta T = H_{PQ}\,G_Q\,H_{QP}\,\frac{1}{1 - G_P^+\,H_{PQ}\,G_Q\,H_{QP}} . \]

Here, $A\,(1 - BA)^{-1} = (1 - AB)^{-1}A$ with $(1 - AB)\,A = A\,(1 - BA)$, and thus

\[ H_{QP}\,\frac{1}{1 - G_P^+\,H_{PQ}\,G_Q\,H_{QP}} = \frac{1}{1 - H_{QP}\,G_P^+\,H_{PQ}\,G_Q}\,H_{QP} . \]

With $\{G_Q\,(1 - H_{QP}\,G_P^+\,H_{PQ}\,G_Q)^{-1}\}^{-1} = (1 - H_{QP}\,G_P^+\,H_{PQ}\,G_Q)\,G_Q^{-1}$, the operator
between $H_{PQ}$ and $H_{QP}$ can also be simplified:

\[ \delta T = H_{PQ}\,\frac{1}{E - H_{QQ} - H_{QP}\,G_P^+\,H_{PQ}}\,H_{QP} . \]

Here, since

\[ H_{QP}\,G_P^+\,H_{PQ} = H_{QP}\,\Bigl(\frac{\mathcal{P}}{E - H_{PP}} - i\pi\,\delta(E - H_{PP})\Bigr)\,H_{PQ} = \Delta - \tfrac12 i\Gamma , \]

it is clear that the poles do not occur at the eigenvalues of $H_{QQ}$, but are displaced by
the level shift $\Delta$, and have a level width $\Gamma$ (see Fig. 5.6):

\[ |\delta T|^2 \sim \frac{1}{(E - H_{QQ} - \Delta)^2 + \tfrac14\Gamma^2} . \]

We will discuss these resonance parameters in the next section. When considering
$\delta T$, the coupling $H_{QP}$ which leads from the P- to the Q-space is initially important,
then the resonance level in Q-space, and finally again the coupling $H_{PQ}$ which leads
back from the Q- to the P-space.

Fig. 5.6 Lorentz curve $\frac{1}{2\pi}/(x^2 + \frac14)$ (continuous red), line shape of a scattering resonance about the
resonance energy $E_R$ with half-width $\Gamma$, where $x = (E - E_R)/\Gamma$. The curve has half the maximum
value at two points which have the distance of the half-width (dashed blue). For this distribution,
$\Delta E = \infty$ and the associated average lifetime is $\hbar/\Gamma$

Near the resonance,

\[ |\psi^+(t)\rangle \sim \exp\Bigl(\frac{-i\,(H_{QQ} + \Delta - \tfrac12 i\Gamma)\,t}{\hbar}\Bigr) , \]

and consequently,

\[ |\psi^+(t)|^2 \sim \exp\Bigl(\frac{-\Gamma t}{\hbar}\Bigr) = \exp\Bigl(\frac{-t}{\tau}\Bigr) , \quad\text{with } \tau = \frac{\hbar}{\Gamma} , \]

where $\tau$ is the average lifetime of the resonance state. We can also view it as the time
uncertainty of the state, because it is now $\overline{t^2} - \bar{t}^{\,2} = \tau^2$. The associated distribution
function $|\psi^+(E)|^2$ in the energy representation is given by a Lorentz curve (with
infinite energy uncertainty according to Problem 4.3). Therefore, the equation $\tau\,\Gamma = \hbar$,
which is a lifetime–half-width relation, is not a time–energy uncertainty relation,
even though this is often claimed. There is no Hermitian time operator in quantum
theory and hence there is also no such inequality, even though each finite wave train
has a finite time and frequency uncertainty, even classically.
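The properties of the Lorentz curve quoted here can be made concrete with a few lines of code. The sketch below assumes the normalized form $\frac{1}{2\pi}/(x^2 + \frac14)$ in the dimensionless variable $x = (E - E_R)/\Gamma$; the function names, the cutoffs, and the example width are illustrative only.

```python
import numpy as np

HBAR_EV_S = 6.582119569e-16          # hbar in eV s (CODATA value)

def lorentz(x):
    """Normalized Lorentz curve in x = (E - E_R)/Gamma."""
    return 1.0 / (2.0 * np.pi) / (x**2 + 0.25)

def truncated_second_moment(cutoff, points=200001):
    """Integral of x^2 * lorentz(x) over [-cutoff, cutoff] (simple Riemann sum)."""
    x = np.linspace(-cutoff, cutoff, points)
    return np.sum(x**2 * lorentz(x)) * (x[1] - x[0])

def lifetime(gamma_ev):
    """Average lifetime tau = hbar / Gamma for a width given in eV."""
    return HBAR_EV_S / gamma_ev

half_max_ratio = lorentz(0.5) / lorentz(0.0)     # half maximum at x = +-1/2
m10, m100 = truncated_second_moment(10.0), truncated_second_moment(100.0)
print(half_max_ratio, m10, m100, lifetime(0.1))
```

The truncated second moment grows roughly linearly with the cutoff, which is the numerical face of the infinite energy uncertainty, while the half-width and hence the lifetime remain perfectly finite.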


5.2.6 Breit-Wigner Formula 


There are various methods for computing

\[ \delta T = H_{PQ}\,\frac{1}{E - H_{QQ} - H_{QP}\,G_P^+\,H_{PQ}}\,H_{QP} . \]

In order to proceed without approximations, we have to diagonalize the denominator,
which means searching for the eigen representation of

\[ H' = H_{QQ} + H_{QP}\,G_P^+\,H_{PQ} , \quad\text{with } G_P^+ = \frac{\mathcal{P}}{E - H_{PP}} - i\pi\,\delta(E - H_{PP}) , \]

where the last term is not Hermitian. Therefore, we now need two sets of solutions
(a bi-orthogonal system) in the Q-space,

\[ \{\mathcal{E}_n - H'(E)\}\,|\Xi_n(E)\rangle = 0 , \]
\[ \{\mathcal{E}_n^* - H'^\dagger(E)\}\,|\Xi_n^{\rm A}(E)\rangle = 0 \quad\Longrightarrow\quad \langle\Xi_n^{\rm A}(E)|\,\{\mathcal{E}_n - H'(E)\} = 0 , \]

with $\langle\Xi_n^{\rm A}(E)|\Xi_{n'}(E)\rangle = \delta_{nn'}$ and $\sum_n |\Xi_n(E)\rangle\langle\Xi_n^{\rm A}(E)| = Q$. The eigenvalues $\mathcal{E}_n$ of
$H'$ are complex, and

\[ \frac{Q}{E - H'} = \sum_n \frac{|\Xi_n(E)\rangle\langle\Xi_n^{\rm A}(E)|}{E - \mathcal{E}_n} \]

holds. Here $G_P^+$ still depends on the energy, and therefore so do $H'$ and the whole
bi-orthogonal system. This seriously complicates the computation.

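These difficulties can be avoided with an approximation. We take the eigenstates
of $H_{QQ}$,

\[ (E_n - H_{QQ})\,|n\rangle = 0 , \]

e.g., those of the box or quadratic potential (see Sects. 4.5.3 and 4.5.4), and obtain
the shift and width according to perturbation theory from

\[ \langle n|H_{QP}\,G_P^+(E)\,H_{PQ}|n\rangle = \mathcal{P}\!\int dE'\,\frac{|\langle n|H_{QP}|\tilde\psi(E')\rangle^+|^2}{E - E'} - i\pi\,|\langle n|H_{QP}|\tilde\psi(E)\rangle^+|^2 \approx \Delta_n(E) - \tfrac12 i\,\Gamma_n(E) . \]

For elastic scattering, this leads to the Breit-Wigner formula

\[ \langle\psi_f|T|\psi_i\rangle \approx \langle\psi_f|\tilde T|\psi_i\rangle + \sum_n \frac{{}^-\langle\tilde\psi_f|H_{PQ}|n\rangle\,\langle n|H_{QP}|\tilde\psi_i\rangle^+}{E - E_n - \Delta_n + \tfrac12 i\,\Gamma_n} . \]

With the level width $\Gamma_n$, the terms for all real energy values remain finite. This is
similar to the result that only finite amplitudes are permitted, as for the damping of
a forced oscillation (Sect. 2.3.8).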


5.2.7 Averaging over the Energy 


So far we have assumed that the energy can be arbitrarily sharp. Actually we should
not do this, but rather calculate with mean values. Even disregarding this aspect, it
is instructive to give an overview of the average behavior.

We denote the mean values as usual with angular brackets or bars and use suitable
weight factors $\rho(E, E')$ to compute them, as in

\[ \langle f(E)\rangle \equiv \overline{f(E)} = \int dE'\,\rho(E, E')\,f(E') , \]
where
\[ \rho(E, E') \approx 0 \ \text{ for } |E - E'| \gg I \quad\text{and}\quad \int dE'\,\rho(E, E') = 1 . \]

The Lorentz distribution is analytically convenient:

\[ \rho(E, E') = \frac{I}{2\pi}\,\frac{1}{(E - E')^2 + \tfrac14 I^2} . \]

It is symmetric in $E$ and $E'$ and has a maximum for $E' = E$, while the half-width $I$
does not lead to cumbersome boundary effects, as the box distribution does. However,
the Lorentz distribution does not have a finite energy uncertainty $\Delta E$; only the
half-width is finite. For a test function $f(E)$ which is regular in the upper complex
half plane and vanishes sufficiently fast for large $|E|$, we have by the Breit–Wigner
formula,

\[ f(E) = \sum_n \frac{A_n}{E - \mathcal{E}_n} , \quad\text{with } \mathcal{E}_n = E_n - \tfrac12 i\,\Gamma_n , \quad \Gamma_n > 0 . \]


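The residue theorem then implies

\[ \overline{f(E)} = \frac{I}{2\pi}\int \frac{dE'}{(E' - E - \tfrac12 iI)(E' - E + \tfrac12 iI)}\,\sum_n \frac{A_n}{E' - \mathcal{E}_n} = \frac{I}{2\pi}\,\frac{2\pi i}{iI}\,\sum_n \frac{A_n}{E + \tfrac12 iI - \mathcal{E}_n} = f(E + \tfrac12 iI) . \]

While the limit $E + io$ has been necessary so far, the average now already leads
to a complex energy: the imaginary part is equal to half of the half-width of the
distribution function. In the averaged scattering amplitude, the level widths are thus
broadened:

\[ \langle\psi_f|\overline{T}|\psi_i\rangle = \langle\psi_f|\overline{\tilde T}|\psi_i\rangle + \sum_n \frac{{}^-\langle\tilde\psi_f|V_{PQ}|\Xi_n\rangle\,\langle\Xi_n^{\rm A}|V_{QP}|\tilde\psi_i\rangle^+}{E - E_n + \tfrac12 i\,(\Gamma_n + I)} . \]

Here we have assumed that $\tilde T$ does not depend strongly on the energy. The interval $I$
of the averaging procedure may be large compared with the resonance widths $\Gamma_n$, but
it must nevertheless be so small that the average $\overline{\tilde T}$ is not altered. ($\tilde T$ comprises only
the broad “potential resonances”, and $\delta T$ the narrow “compound nucleus (Feshbach)
resonances”.)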
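The statement that Lorentz averaging merely shifts the energy into the complex plane can be tested by brute-force integration. The sketch below uses two hypothetical resonance poles in the lower half plane and a plain Riemann sum on a wide, fine grid; the grid limits, pole positions, and amplitudes are illustrative assumptions.

```python
import numpy as np

def lorentz_weight(E, E_prime, I):
    """rho(E, E') = (I / 2 pi) / ((E - E')^2 + I^2 / 4)."""
    return I / (2 * np.pi) / ((E - E_prime) ** 2 + 0.25 * I**2)

def pole_sum(E, amplitudes, poles):
    """f(E) = sum_n A_n / (E - pole_n), all poles in the lower half plane."""
    E = np.asarray(E, dtype=complex)
    return sum(A / (E - p) for A, p in zip(amplitudes, poles))

amplitudes = [1.0, 0.5 - 0.2j]                 # hypothetical residues A_n
poles = [1.0 - 0.05j, 2.5 - 0.15j]             # E_n - i Gamma_n / 2
E, I = 1.8, 0.4                                # energy and averaging width

dE = 0.002
grid = np.arange(-1000.0, 1000.0, dE)          # wide grid for the slowly decaying tails
averaged = np.sum(lorentz_weight(E, grid, I) * pole_sum(grid, amplitudes, poles)) * dE
shifted = pole_sum(E + 0.5j * I, amplitudes, poles)
err_average = abs(averaged - shifted)
print(averaged, shifted)
```

In the shifted expression each denominator reads $E - E_n + \tfrac12 i\,(\Gamma_n + I)$, which is the broadening of the level widths stated above.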




5.2.8 Special Features of Three-Body Problems 


In the rest of this section, we shall treat a special aspect of the scattering theory which 
in fact does not belong to a standard course on Quantum Theory II, although it is 
nevertheless important and instructive. If three partners 1, 2, and 3 are involved in a 
reaction, then there are many more reaction possibilities than for only two of them. 
If initially, e.g., 2 and 3 are bound to each other and form the collision partner for 1, 
then the following transitions are possible: 


    1 + (2 + 3)  →  1 + (2 + 3)   elastic (and inelastic) scattering,
                 →  2 + (3 + 1)   disorder reaction,
                 →  3 + (1 + 2)   disorder reaction,
                 →  1 + 2 + 3     fission reaction.


For fission, one partner can initially also leave the interaction regime, while the 
others stay together for a while. We then speak of stepwise decay, and of a final-state 
interaction between first and second decay, even though this “final state” also decays. 

If we trace the reaction back to two-body forces (not including many-body forces), 
then we must nevertheless be careful to distinguish between genuine three-body 
operators and those for which the unit operator for one particle may be split-off, then 
multiplied by a two-body operator for the two remaining particles. For example, for 
the interaction between particles 2 and 3, we write 


If the particle is involved, then its number appears as a superscript; if it is not involved,
then it appears as a subscript. Lower-case letters now indicate two-body operators. For two-body
forces, we then have $V = V^{23} + V^{31} + V^{12} = V_1 + V_2 + V_3$.

Since for the disorder reaction $1 + (2 + 3) \to 2 + (3 + 1)$, initially $V_1$ and then
$V_2$ leads to a bound state of the corresponding pair, instead of the free Hamilton
operators $H_0$, we clearly now need the channel Hamilton operators

\[ H_\alpha = H_0 + V_\alpha , \]
and the “residual interaction” is
\[ V^\alpha = V - V_\alpha = H - H_\alpha . \]
$V^\alpha$ thus contains all two-body interactions involving $\alpha$, e.g., $V^1 = V^{12} + V^{13} = V_3 + V_2$.
In order to capture the fission, let us also allow $\alpha = 0$, i.e., $\alpha \in \{0, 1, 2, 3\}$, and
require $V_0 = 0$ or $V^0 = V$. In addition to the full resolvent $G$, we also introduce
channel resolvents $G_\alpha$:

\[ G(\mathcal{E}) = \frac{1}{\mathcal{E} - H} , \qquad G_\alpha(\mathcal{E}) = \frac{1}{\mathcal{E} - H_\alpha} . \]

Fig. 5.7 Unconnected graphs for three-body scattering. Here partner 1 is not involved and delivers
useless factors. Left: $V_1$. Center: $T_1$. Right: $V_1 G_0 T_1$. Partners 2 and 3 participate in the two-body
scattering

Then according to Sect. 5.2.1, the Lippmann—Schwinger equations are valid: 


\[ G_\alpha = (1 + G_\alpha V_\alpha)\,G_0 = G_0\,(1 + V_\alpha G_\alpha) , \]
\[ G = (1 + G V^\alpha)\,G_\alpha = G_\alpha\,(1 + V^\alpha G) . \]


These equations are in fact correct, but the last row does not fix the unknown resolvent 
G uniquely. Here we would have to invert the operator 


\[ 1 - G_\alpha V^\alpha = 1 - G_0\,(1 + V_\alpha G_\alpha)\,V^\alpha = 1 - G_0 V^\alpha - G_0 V_\alpha G_\alpha V^\alpha . \]

But with $V^1 = V_2 + V_3$ (with $\alpha = 1$), it contains the parts $G_0 V_2$ and $G_0 V_3$, and
hence different unit operators (the “non-involved part”, unconnected graphs) (see
Fig. 5.7). In the energy and momentum representations, this leads to delta functions,
and in the real-space representation to divergent integrals, which requires another
approach. Note that such problems do not occur for $V_\alpha G_\alpha V^\alpha$, because all parts are
involved.


5.2.9 The Method of Kazaks and Greider 


One possibility for solution is a method due to Kazaks and Greider [5]. As for the two-
potential formula, we deal initially only with parts of the interaction. In particular,
we take the transition operators for the two-body scattering due to $v_\alpha$ (with $\alpha \ne 0$),

\[ t_\alpha = v_\alpha\,(1 + g_0 t_\alpha) = (1 + t_\alpha g_0)\,v_\alpha , \]

and use the energy $E - E_\alpha$ in $g_0$. We leave the particle $\alpha$ untouched and begin by
solving the scattering problem for the two remaining partners. Then we may also use

\[ T_\alpha = t_\alpha\,1_\alpha = V_\alpha\,(1 + G_0 T_\alpha) = (1 + T_\alpha G_0)\,V_\alpha , \]
with
\[ (1 - G_0 V_\alpha)(1 + G_0 T_\alpha) = 1 \quad\text{and}\quad T_\alpha G_0 = V_\alpha G_\alpha , \]

and we need $T_1$, $T_2$, and $T_3$. Then with $\alpha \ne \beta \ne \gamma \ne \alpha$, and thus $V^\alpha = V_\beta + V_\gamma$,
we obtain

\[ 1 - G_0 V^\alpha = 1 - G_0 V_\beta - G_0 V_\gamma = (1 - G_0 V_\beta)\,\{1 - (1 + G_0 T_\beta)\,G_0 V_\gamma\} . \]

The last factor is equal to $1 - G_0 V_\gamma - G_0 T_\beta G_0 V_\gamma$, and with $T_\gamma = (1 + T_\gamma G_0)\,V_\gamma$,
or $V_\gamma = T_\gamma\,(1 - G_0 V_\gamma)$, it may also be factorized:

\[ 1 - G_0 V_\gamma - G_0 T_\beta G_0 V_\gamma = (1 - G_0 T_\beta G_0 T_\gamma)(1 - G_0 V_\gamma) . \]
Consequently, $(1 - G_0 V^\alpha)^{-1}$ can be decomposed into three factors:
\[ (1 - G_0 V^\alpha)^{-1} = (1 + G_0 T_\gamma)\,(1 - G_0 T_\beta G_0 T_\gamma)^{-1}\,(1 + G_0 T_\beta) . \]


Here $\beta$ and $\gamma$ may be exchanged. Therefore, for the transition operator $T^\alpha$ associated
with $V^\alpha = V_\beta + V_\gamma$ (with $\alpha \ne 0$), we obtain

\[ T^\alpha = (1 + T^\alpha G_0)\,V^\alpha = V^\alpha\,(1 + G_0 T^\alpha) = V^\alpha\,(1 - G_0 V^\alpha)^{-1} , \]

along with $V_\beta\,(1 + G_0 T_\beta) = T_\beta$ and $V_\gamma\,(1 + G_0 T_\gamma) = T_\gamma$, and hence the expression
(see Fig. 5.8)

\[ T^\alpha = T_\beta\,(1 - G_0 T_\gamma G_0 T_\beta)^{-1}\,(1 + G_0 T_\gamma) + T_\gamma\,(1 - G_0 T_\beta G_0 T_\gamma)^{-1}\,(1 + G_0 T_\beta) . \]

The initially non-invertible operator $1 - G_\alpha V^\alpha$ with $V^\alpha = T^\alpha\,(1 - G_0 V^\alpha)$ may now
be split up into a product:

\[ 1 - G_\alpha V^\alpha = 1 - G_0 V^\alpha - G_0 T_\alpha G_0 V^\alpha = (1 - G_0 T_\alpha G_0 T^\alpha)(1 - G_0 V^\alpha) . \]

Fig. 5.8 Connected graphs for three-body scattering. Here we consider the example $T_3 G_0 T_1$. These
arise for the method of Kazaks and Greider and also for the iterated Faddeev equation. The scattering
problem is therefore soluble

Both factors are invertible. In particular, $(1 - G_0 V^\alpha)(1 + G_0 T^\alpha) = 1$. Therefore,
for the unknown resolvent $G$ from $(1 - G_\alpha V^\alpha)\,G = G_\alpha$, we have the unique result

\[ G = (1 + G_0 T^\alpha)\,(1 - G_0 T_\alpha G_0 T^\alpha)^{-1}\,G_\alpha . \]

The operators $T_\alpha$, $T_\beta$, and $T_\gamma$ are extremely useful for solving the problem. Only by
controlling the two-body scattering can we treat the three-body scattering.
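The algebra of this section can be checked with finite matrices. The sketch below takes random Hermitian matrices as stand-ins for the free Hamilton operator and the three pair interactions; in such a finite-dimensional model there are no unconnected graphs and nothing diverges, so it only tests the operator identities, not the solubility argument. The dimension, seed, and complex energy are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
one = np.eye(n)

def random_hermitian(scale):
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return scale * (a + a.conj().T) / 2

H0 = random_hermitian(1.0)
V_a, V_b, V_g = (random_hermitian(0.4) for _ in range(3))   # V_alpha, V_beta, V_gamma
E = 0.2 + 0.6j
inv = np.linalg.inv

G0 = inv(E * one - H0)

def t_op(V):
    """Transition operator T = V (1 - G0 V)^(-1) for a single coupling."""
    return V @ inv(one - G0 @ V)

T_a, T_b, T_g = t_op(V_a), t_op(V_b), t_op(V_g)
V_up = V_b + V_g                                            # residual interaction V^alpha

# three-factor decomposition of (1 - G0 V^alpha)^(-1)
lhs = inv(one - G0 @ V_up)
rhs = (one + G0 @ T_g) @ inv(one - G0 @ T_b @ G0 @ T_g) @ (one + G0 @ T_b)
err_factor = np.max(np.abs(lhs - rhs))

# resolvent G = (1 + G0 T^alpha)(1 - G0 T_alpha G0 T^alpha)^(-1) G_alpha
T_up = t_op(V_up)
G_alpha = inv(E * one - H0 - V_a)
G_kg = (one + G0 @ T_up) @ inv(one - G0 @ T_a @ G0 @ T_up) @ G_alpha
err_resolvent = np.max(np.abs(G_kg - inv(E * one - H0 - V_a - V_b - V_g)))
print(err_factor, err_resolvent)
```

Both residuals are at the level of rounding errors, confirming the factorization and the final expression for the full resolvent.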


5.2.10 Faddeev Equations 


In the last equation, $G$ may be decomposed into three parts:
\[ G = G^1 + G^2 + G^3 , \]
where (with $\alpha = 1$)
\[ G^1 = G_1 + G_0 T_1\,(G^2 + G^3) , \]
\[ G^2 = G_0 T_2\,(G^1 + G^3) , \]
\[ G^3 = G_0 T_3\,(G^1 + G^2) . \]

Hence,

\[ G^2 = G_0 T_2\,G^1 + G_0 T_2 G_0 T_3\,(G^1 + G^2) = (1 - G_0 T_2 G_0 T_3)^{-1}\,G_0 T_2\,(1 + G_0 T_3)\,G^1 . \]

Using $(1 - AB)^{-1} A = A\,(1 - BA)^{-1}$, this is equivalent to

\[ G^2 = G_0 T_2\,(1 - G_0 T_3 G_0 T_2)^{-1}\,(1 + G_0 T_3)\,G^1 , \]
\[ G^3 = G_0 T_3\,(1 - G_0 T_2 G_0 T_3)^{-1}\,(1 + G_0 T_2)\,G^1 . \]

We then also have $G^2 + G^3 = G_0 T^1 G^1$ and thus $G^1 = G_1 + G_0 T_1 G_0 T^1 G^1$. If we
solve with respect to $G^1$, then we find $G^1 = (1 - G_0 T_1 G_0 T^1)^{-1}\,G_1$. Consequently,
the initial equation is equivalent to

\[ G = (1 + G_0 T^1)\,G^1 = (1 + G_0 T^1)\,(1 - G_0 T_1 G_0 T^1)^{-1}\,G_1 . \]

This expression for the resolvent $G$ was also derived in the last section. Hence, if the
initial state has $\alpha = 1$, we have proven the Faddeev equations

\[ \begin{pmatrix} G^1 \\ G^2 \\ G^3 \end{pmatrix} = \begin{pmatrix} G_1 \\ 0 \\ 0 \end{pmatrix} + G_0 \begin{pmatrix} 0 & T_1 & T_1 \\ T_2 & 0 & T_2 \\ T_3 & T_3 & 0 \end{pmatrix} \begin{pmatrix} G^1 \\ G^2 \\ G^3 \end{pmatrix} , \]


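which deliver $G = G^1 + G^2 + G^3$. After an iteration, they have a unique solution,
because then only connected graphs occur:

\[ \begin{pmatrix} G^1 \\ G^2 \\ G^3 \end{pmatrix} = \begin{pmatrix} G_1 \\ G_0 T_2 G_1 \\ G_0 T_3 G_1 \end{pmatrix} + G_0 \begin{pmatrix} T_1 G_0 (T_2 + T_3) & T_1 G_0 T_3 & T_1 G_0 T_2 \\ T_2 G_0 T_3 & T_2 G_0 (T_1 + T_3) & T_2 G_0 T_1 \\ T_3 G_0 T_2 & T_3 G_0 T_1 & T_3 G_0 (T_1 + T_2) \end{pmatrix} \begin{pmatrix} G^1 \\ G^2 \\ G^3 \end{pmatrix} . \]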


More details can be found in the book by Schmid and Ziegelmann [6]. 
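As a consistency check, the Faddeev equations can be solved as one block linear system for finite matrices and compared with the directly inverted full resolvent. The following sketch again uses random Hermitian stand-ins for $H_0$ and the pair interactions; it is a toy model in which connectedness plays no role, and all numerical values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
one, zero = np.eye(n), np.zeros((n, n))
inv = np.linalg.inv

def random_hermitian(scale):
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return scale * (a + a.conj().T) / 2

H0 = random_hermitian(1.0)
V1, V2, V3 = (random_hermitian(0.4) for _ in range(3))
E = -0.1 + 0.7j

G0 = inv(E * one - H0)
T1, T2, T3 = (V @ inv(one - G0 @ V) for V in (V1, V2, V3))
G1 = inv(E * one - H0 - V1)                     # channel resolvent for alpha = 1

# Faddeev kernel G0 * [[0, T1, T1], [T2, 0, T2], [T3, T3, 0]] as a block matrix
K = np.block([[zero, G0 @ T1, G0 @ T1],
              [G0 @ T2, zero, G0 @ T2],
              [G0 @ T3, G0 @ T3, zero]])
b = np.vstack([G1, zero, zero])
X = np.linalg.solve(np.eye(3 * n) - K, b)       # stacked components G^1, G^2, G^3

G_faddeev = X[:n] + X[n:2 * n] + X[2 * n:]
G_exact = inv(E * one - H0 - V1 - V2 - V3)
err_faddeev = np.max(np.abs(G_faddeev - G_exact))

# iterated form: same solution with kernel K @ K and inhomogeneity b + K @ b
X_iter = np.linalg.solve(np.eye(3 * n) - K @ K, b + K @ b)
err_iterated = np.max(np.abs(X_iter - X))
print(err_faddeev, err_iterated)
```

The block rows of `K @ K` reproduce the entries $T_1 G_0 (T_2 + T_3)$, $T_1 G_0 T_3$, and so on of the iterated equations, each preceded by a factor $G_0$.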


5.2.11 Summary: Two- and Three-Body Scattering Problems 


Here we presented the generalized framework for scattering theory, followed by
several important applications. These made use of the two-potential formula ($V = \tilde V + \delta V$)
due to Gell-Mann and Goldberger: $T = \tilde T + (1 + \tilde T G_0)\,\delta T\,(1 + G_0 \tilde T)$.
This helps, e.g., with the scattering of charged particles, because the Coulomb potential
has too long a range for a simple scattering theory, but also for resonances, where
the coupling of the scattering states to bound states becomes important.


5.3 Many-Body Systems 


5.3.1 One- and Many-Body States 


Since generally $n$ is taken as the occupation number for many-particle problems, we
shall now write $|\nu\rangle$ to indicate a one-particle basis, instead of $|n\rangle$ as used so far. We
start from a complete orthonormal set of one-particle states $|\nu\rangle$, whence

\[ \sum_\nu |\nu\rangle\langle\nu| = 1 \quad\text{and}\quad \langle\nu|\nu'\rangle = \delta_{\nu\nu'} . \]

For continuous quantum numbers $\nu$, there will be an integral here instead of the sum,
and the delta function instead of the Kronecker symbol. Here we order the states $|\nu\rangle$
with respect to their energy. This is not actually important for the time being, but
the notation $\nu < \nu'$ should always make sense, and later it is mainly states with low
energy that will be occupied.

$N$ particles have $N$ times as many degrees of freedom as a single particle, and the
Hilbert space has correspondingly more quantum numbers and dimensions. As long
as they do not interact with each other, for each individual particle, we can identify
the one-particle state it is in, provided that there are pure states, the case to which we
restrict ourselves here. Let the first particle be in the state $|\nu_1\rangle$, the second in $|\nu_2\rangle$, and so on.
Then we may consider a product of one-particle states

\[ |\nu_1 \nu_2 \ldots \nu_N\rangle = |\nu_1\rangle \otimes |\nu_2\rangle \otimes \cdots \otimes |\nu_N\rangle \]

for the corresponding $N$-particle state.

One basic assumption in the following is now that these $N$-particle states always
form a complete and orthonormalized basis, even if the particles interact with each
other. Then any possible $N$-particle state $|N \ldots\rangle$ may be built from these states:

\[ |N \ldots\rangle = \sum_{\nu_1 \ldots \nu_N} |\nu_1 \ldots \nu_N\rangle\,\langle\nu_1 \ldots \nu_N|N \ldots\rangle , \]

since the states $|\nu_1 \ldots \nu_N\rangle$ form a complete basis, i.e.,

\[ \sum_{\nu_1 \ldots \nu_N} |\nu_1 \ldots \nu_N\rangle\langle\nu_1 \ldots \nu_N| = 1 , \]

and are orthonormalized, i.e.,

\[ \langle\nu_1 \ldots \nu_N|\nu_1' \ldots \nu_N'\rangle = \langle\nu_1|\nu_1'\rangle \cdots \langle\nu_N|\nu_N'\rangle . \]

Here we shall also allow for improper Hilbert vectors, where integrals occur instead
of sums.

This framework is generally unnecessary, however, for identical particles. For 
indistinguishable particles, we cannot state which is the first or which is the last, 
because the interchange of two particles does not change the expectation value of 
an arbitrary observable—otherwise the particles were not identical. Since we shall 
now occupy ourselves with such indistinguishable particles, it is clear that we should 
only have superpositions of states with an exchange symmetry: if the order of the 
particles changes, at most the phase factor of the states can change. 


5.3.2 Exchange Symmetry 


Let the transposition operator $\mathcal{P}_{kl} = \mathcal{P}_{lk}$ exchange the particles labelled $k$ and $l$:

\[ \mathcal{P}_{kl}\,|\ldots \nu_k \ldots \nu_l \ldots\rangle = |\ldots \nu_l \ldots \nu_k \ldots\rangle . \]

Since $\mathcal{P}_{kl}^2$ leads back to the old state, the operator $\mathcal{P}_{kl}$ has the eigenvalues $\pm 1$.
Its eigenstates for the particles $k$ and $l$ are said to be symmetric ($p_{kl} = +1$) or anti-
symmetric ($p_{kl} = -1$). Let us now consider all $N!$ different permutations $\mathcal{P}$ of an
$N$-particle state $|\nu_1 \ldots \nu_N\rangle$. They can be built from products of pair-exchange operators
$\mathcal{P}_{kl}$, although not uniquely. The only thing that is fixed is whether an even or an
odd number of pair exchanges is necessary. We speak of even and odd permutations
(see Fig. 5.9).
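The classification into even and odd permutations can be reproduced by counting inversions. The sketch below is one common way to compute the sign of a permutation; the function name and the representation of a permutation as a tuple of indices are choices made for this illustration.

```python
from itertools import permutations

def parity(perm):
    """Return +1 for an even and -1 for an odd permutation (inversion count)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

perms = list(permutations(range(3)))            # the 3! = 6 permutations
even = [p for p in perms if parity(p) == 1]
odd = [p for p in perms if parity(p) == -1]
print(even)   # identity and the two cyclic permutations
print(odd)    # the three transpositions
```

The permutation `(2, 1, 0)` has three inversions, so it can be written as three neighbor exchanges, yet it is also the single transposition of the outer objects; only the parity of the number of exchanges is fixed.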


Fig. 5.9 The 3! = 6 different permutations of three objects. The even permutations are the identity,
the cyclic, and the anti-cyclic permutations, the odd ones are the three transpositions. The last shows
three transpositions, even though it can also be understood as a single transposition with particle 2
remaining unchanged

Fig. 5.10 Representation of $\mathcal{P}_{kl}\,\mathcal{P}_{lm}\,\mathcal{P}_{km}\,\mathcal{P}_{lm} = 1$. As $p_{lm}^2 = 1$, it is clear that
$p_{km} = p_{kl}$ for all $k \ne l \ne m$


For identical particles, the eigenvalues $p_{kl}$ have to be either all $+1$ or all $-1$,
because the exchange symmetry is a characteristic of the considered particles: they
form either symmetric or antisymmetric states. The state cannot have one exchange
symmetry in the pairs $(k, l)$ and $(k, m)$, but the other in the pair $(l, m)$, as Fig.
5.10 shows. Therefore we may restrict ourselves to either completely symmetric or
completely antisymmetric states.

In the following, we label symmetric states with an s on the Dirac symbol and
antisymmetric ones with an a:

\[ |\ldots \nu_k \ldots \nu_l \ldots\rangle_s = +\,|\ldots \nu_l \ldots \nu_k \ldots\rangle_s , \quad\text{for all } k \text{ and } l , \]
\[ |\ldots \nu_k \ldots \nu_l \ldots\rangle_a = -\,|\ldots \nu_l \ldots \nu_k \ldots\rangle_a , \quad\text{for all } k \text{ and } l , \]
or, with $\delta_{\mathcal{P}} = +1$ for even permutations and $\delta_{\mathcal{P}} = -1$ for odd,
\[ \mathcal{P}\,|\nu_1 \ldots \nu_N\rangle_s = |\nu_1 \ldots \nu_N\rangle_s , \]
\[ \mathcal{P}\,|\nu_1 \ldots \nu_N\rangle_a = \delta_{\mathcal{P}}\,|\nu_1 \ldots \nu_N\rangle_a . \]

Symmetric states describe bosons, and antisymmetric ones fermions.

Hence, two fermions cannot occupy the same one-particle state, because upon
transposition of the two particles, the many-body state has to change sign. We now
have the basic ingredient for the famous Pauli exclusion principle. For symmetric
states (bosons), this restriction does not exist. If $n_\nu$ gives the particle number in the
state $|\nu\rangle$, then for bosons, $n_\nu \in \{0, 1, 2, \ldots\}$ holds, while for fermions $n_\nu \in \{0, 1\}$.
The sum of all occupation numbers $n_\nu$ yields the total number $N$ of particles, viz.,

\[ N = \sum_\nu n_\nu . \]
The permutation operators $\mathcal{P}$ all have an inverse:

\[ \mathcal{P}^{-1}\mathcal{P} = 1 = \mathcal{P}\,\mathcal{P}^{-1} . \]

In addition, nothing changes if all bra- and all ket-vectors undergo the same permutation,

\[ \langle\mathcal{P}\psi|\mathcal{P}\varphi\rangle = \langle\psi|\varphi\rangle . \]

Permutation operators are thus unitary.
All observables $O$ of an $N$-particle system have to commute with permutations,
as long as we are dealing with identical particles:

\[ O = \mathcal{P}^\dagger O\,\mathcal{P} \quad\Longrightarrow\quad [O, \mathcal{P}] = 0 . \]
Therefore, no perturbation can alter the symmetry: $O = \mathcal{P}_{kl}^\dagger \mathcal{P}_{kl}\,O = \mathcal{P}_{kl}^\dagger\,O\,\mathcal{P}_{kl}$
delivers ${}_s\langle\nu_1 \ldots \nu_N|\,O\,|\nu_1' \ldots \nu_N'\rangle_a = -\,{}_s\langle\nu_1 \ldots \nu_N|\,O\,|\nu_1' \ldots \nu_N'\rangle_a = 0$. In particular,
symmetric and antisymmetric states are orthogonal to each other, which follows by
inserting $O = 1$, and the symmetry does not change with time, because the Hamilton
operator is invariant under permutations.


5.3.3. Symmetric and Antisymmetric States 


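In order to form symmetric and antisymmetric states from arbitrary many-body states
$|\nu_1 \ldots \nu_N\rangle$, we take the symmetrizing and antisymmetrizing operators

\[ \mathcal{S} = \frac{1}{N!}\sum_{\mathcal{P}} \mathcal{P} \quad\text{and}\quad \mathcal{A} = \frac{1}{N!}\sum_{\mathcal{P}} \delta_{\mathcal{P}}\,\mathcal{P} . \]

Here the sums run over all $N!$ different permutations. The two expressions can be
proven together. For this we set

\[ \Lambda = \frac{1}{N!}\sum_{\mathcal{P}} \lambda(\mathcal{P})\,\mathcal{P} , \quad\text{with}\quad \Lambda = \mathcal{S},\ \lambda(\mathcal{P}) = 1 \ \text{for bosons}, \qquad \Lambda = \mathcal{A},\ \lambda(\mathcal{P}) = \delta_{\mathcal{P}} \ \text{for fermions}. \]

In particular, with
\[ \mathcal{P}_{kl}\,\mathcal{S}\,|\nu_1 \ldots \nu_N\rangle = \mathcal{S}\,|\nu_1 \ldots \nu_N\rangle , \qquad \mathcal{P}_{kl}\,\mathcal{A}\,|\nu_1 \ldots \nu_N\rangle = -\mathcal{A}\,|\nu_1 \ldots \nu_N\rangle , \]
we find $\mathcal{P}_{kl}\,\Lambda = \lambda(\mathcal{P}_{kl})\,\Lambda$ and therefore also
\[ \mathcal{P}\,\Lambda = \lambda(\mathcal{P})\,\Lambda = \Lambda\,\mathcal{P} . \]
It remains to show that $\Lambda$ is idempotent, to be sure that it is a projection operator.
But now $N!\,\Lambda^2 = \sum_{\mathcal{P}} \lambda(\mathcal{P})\,\mathcal{P}\,\Lambda = \sum_{\mathcal{P}} \lambda^2(\mathcal{P})\,\Lambda$ holds and $\sum_{\mathcal{P}} 1 = N!$, so we
do indeed have $\Lambda^2 = \Lambda$. In addition, $\Lambda$ is Hermitian, because $\mathcal{P}$ is unitary, $\lambda(\mathcal{P}) = \lambda(\mathcal{P}^{-1})$,
and the sum over all $\mathcal{P}$ is equal to the sum over all $\mathcal{P}^{-1}$:

\[ \Lambda = \Lambda^2 = \Lambda^\dagger , \quad\text{for } \Lambda = \mathcal{S} \text{ and } \Lambda = \mathcal{A} . \]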
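The projector properties can be verified explicitly in a small product space. The sketch below builds the permutation matrices for $N = 3$ particles with $d = 3$ one-particle states each; the helper names and the convention for how a permutation acts on the particle labels are assumptions of this illustration (the sums over all permutations do not depend on that convention).

```python
import numpy as np
from itertools import permutations, product
from math import comb, factorial

def parity(perm):
    """+1 for even, -1 for odd permutations (inversion count)."""
    n = len(perm)
    inv = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def permutation_operator(perm, d):
    """Matrix permuting the tensor factors of an N-particle product basis."""
    N = len(perm)
    dims = (d,) * N
    op = np.zeros((d**N, d**N))
    for states in product(range(d), repeat=N):
        new = tuple(states[perm[i]] for i in range(N))
        op[np.ravel_multi_index(new, dims), np.ravel_multi_index(states, dims)] = 1.0
    return op

N, d = 3, 3
ops = [(parity(p), permutation_operator(p, d)) for p in permutations(range(N))]
S = sum(op for _, op in ops) / factorial(N)            # symmetrizer
A = sum(sign * op for sign, op in ops) / factorial(N)  # antisymmetrizer

err_idempotent = max(np.max(np.abs(S @ S - S)), np.max(np.abs(A @ A - A)))
err_hermitian = max(np.max(np.abs(S - S.T)), np.max(np.abs(A - A.T)))
err_orthogonal = np.max(np.abs(S @ A))
dim_sym, dim_anti = round(np.trace(S)), round(np.trace(A))
print(dim_sym, dim_anti)
```

The traces count the independent symmetric and antisymmetric states: $\binom{d+N-1}{N} = 10$ boson states and $\binom{d}{N} = 1$ fermion state for this choice of $N$ and $d$.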


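The operator $\Lambda$ is a linear combination of the unitary operators $\mathcal{P}$, and hence itself not
unitary. Furthermore, although we have already found the projection operators $\mathcal{S}$ and
$\mathcal{A}$, we must nevertheless also normalize the unknown symmetric and antisymmetric
states correctly. If $n_\nu$ gives the number of bosons in the one-particle state $|\nu\rangle$, then
we have
\[ |\nu_1 \ldots \nu_N\rangle_s = \sqrt{\frac{N!}{n_1!\,n_2! \cdots}}\ \mathcal{S}\,|\nu_1 \ldots \nu_N\rangle , \]
\[ |\nu_1 \ldots \nu_N\rangle_a = \sqrt{N!}\ \mathcal{A}\,|\nu_1 \ldots \nu_N\rangle . \]
For fermions, the last equation with $\mathcal{A}^\dagger \mathcal{A} = \mathcal{A}$ delivers

\[ {}_a\langle\nu_1 \ldots \nu_N|\nu_1 \ldots \nu_N\rangle_a = N!\,\langle\nu_1 \ldots \nu_N|\,\mathcal{A}\,|\nu_1 \ldots \nu_N\rangle = \sum_{\mathcal{P}} \delta_{\mathcal{P}}\,\langle\nu_1 \ldots \nu_N|\,\mathcal{P}\,|\nu_1 \ldots \nu_N\rangle . \]

But here only $\mathcal{P} = 1$ contributes, with $\langle\nu_1 \ldots \nu_N|\nu_1' \ldots \nu_N'\rangle = \langle\nu_1|\nu_1'\rangle \cdots \langle\nu_N|\nu_N'\rangle$
and because for fermions all $\nu_i$ have to be different. Thus $|\nu_1 \ldots \nu_N\rangle_a$ is normalized
correctly. In contrast, in the expression $\langle\nu_1 \ldots \nu_N|\,\mathcal{S}\,|\nu_1 \ldots \nu_N\rangle$ for bosons, the
$n_1!\,n_2! \cdots$ terms contribute a 1, for which $\mathcal{P}\,|\nu_1 \ldots \nu_N\rangle$ is equal to $|\nu_1 \ldots \nu_N\rangle$. This
implies

\[ |\nu_1 \ldots \nu_N\rangle_s = \frac{1}{\sqrt{N!\,n_1!\,n_2! \cdots}}\sum_{\mathcal{P}} \mathcal{P}\,|\nu_1 \ldots \nu_N\rangle , \]
\[ |\nu_1 \ldots \nu_N\rangle_a = \frac{1}{\sqrt{N!}}\sum_{\mathcal{P}} \delta_{\mathcal{P}}\,\mathcal{P}\,|\nu_1 \ldots \nu_N\rangle , \]

where both sums run over all $N!$ permutations. The first sum has $n_1!\,n_2! \cdots$ equal
terms and can be summed up correspondingly:

\[ |\nu_1 \ldots \nu_N\rangle_s = \sqrt{\frac{n_1!\,n_2! \cdots}{N!}}\ \sum_{\mathcal{P}'} \mathcal{P}'\,|\nu_1 \ldots \nu_N\rangle , \]

if we take only the permutations $\mathcal{P}'$ which lead to different states.
To compute matrix elements with $\Lambda O \Lambda = \Lambda O = O \Lambda$, it is sufficient to symmetrize
only in the bra- or ket-vectors. But we then have to normalize correctly:

\[ {}_s\langle\nu_1 \ldots \nu_N|\,O\,|\nu_1' \ldots \nu_N'\rangle_s = \sqrt{\frac{N!}{n_1!\,n_2! \cdots}}\ \langle\nu_1 \ldots \nu_N|\,O\,|\nu_1' \ldots \nu_N'\rangle_s , \]
\[ {}_a\langle\nu_1 \ldots \nu_N|\,O\,|\nu_1' \ldots \nu_N'\rangle_a = \sqrt{N!}\ \langle\nu_1 \ldots \nu_N|\,O\,|\nu_1' \ldots \nu_N'\rangle_a . \]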


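Note that the completeness relation for the $N$-fermion system (hence considering only antisymmetric states) with

$$|\nu_1' \ldots \nu_N'\rangle_a = \frac{1}{N!} \sum_{\nu_1 \ldots \nu_N} |\nu_1 \ldots \nu_N\rangle_a\; {}_a\langle \nu_1 \ldots \nu_N | \nu_1' \ldots \nu_N' \rangle_a = \sum_{\nu_1 < \cdots < \nu_N} |\nu_1 \ldots \nu_N\rangle_a\; {}_a\langle \nu_1 \ldots \nu_N | \nu_1' \ldots \nu_N' \rangle_a$$

may be written in two ways, viz.,

$$\sum_{\nu_1 \ldots \nu_N} |\nu_1 \ldots \nu_N\rangle_a\; {}_a\langle \nu_1 \ldots \nu_N| = N! ,$$

or with far fewer terms

$$\sum_{\nu_1 < \nu_2 < \cdots < \nu_N} |\nu_1 \ldots \nu_N\rangle_a\; {}_a\langle \nu_1 \ldots \nu_N| = 1 .$$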


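In the real-space representation, the $N$-fermion state $|\nu_1 \ldots \nu_N\rangle_a$ reads

$$\langle \mathbf{r}_1 \ldots \mathbf{r}_N | \nu_1 \ldots \nu_N \rangle_a = \frac{1}{\sqrt{N!}} \sum_{\mathcal{P}} \delta_a(\mathcal{P})\, \langle \mathbf{r}_1 \ldots \mathbf{r}_N |\, \mathcal{P}\, | \nu_1 \ldots \nu_N \rangle = \frac{1}{\sqrt{N!}} \begin{vmatrix} \langle \mathbf{r}_1|\nu_1\rangle & \cdots & \langle \mathbf{r}_N|\nu_1\rangle \\ \vdots & \ddots & \vdots \\ \langle \mathbf{r}_1|\nu_N\rangle & \cdots & \langle \mathbf{r}_N|\nu_N\rangle \end{vmatrix} .$$

The last expression is called the (normalized) Slater determinant.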
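As an illustration of the Slater determinant, the permutation sum can be evaluated directly for a few particles. This is a minimal sketch under stated assumptions: the one-dimensional orbitals $1, x, x^2$ are hypothetical and not even orthonormal, chosen only because the determinant is then a Vandermonde determinant that is easy to check by hand, and the function name `slater` is made up for this example.

```python
from itertools import permutations
from math import factorial, sqrt


def sign(perm):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def slater(orbitals, positions):
    """Evaluate (1/sqrt(N!)) * sum_P sign(P) * prod_i orbital_{P(i)}(r_i)."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = float(sign(perm))
        for particle, orbital_index in enumerate(perm):
            term *= orbitals[orbital_index](positions[particle])
        total += term
    return total / sqrt(factorial(n))


# Hypothetical orbitals 1, x, x^2 (illustrative only).
orbitals = [lambda x: 1.0, lambda x: x, lambda x: x * x]

value = slater(orbitals, (0.3, 1.1, 2.0))
swapped = slater(orbitals, (1.1, 0.3, 2.0))   # exchange particles 1 and 2
coincident = slater(orbitals, (0.3, 0.3, 2.0))  # two particles at the same point
print(value, swapped, coincident)
```

Exchanging two coordinates flips the sign, and the amplitude vanishes when two coordinates coincide or when the same orbital is used twice.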

Calculations with symmetric or antisymmetric states can be greatly simplified with creation and annihilation operators: the former increase the particle number by one, while the latter lower it by one. Therefore, in the following we have the particle vacuum $|0\rangle$, one-particle states $|\nu\rangle$, and $N$-particle states $|\nu_1 \ldots \nu_N\rangle_s$ and $|\nu_1 \ldots \nu_N\rangle_a$ for $N \ge 2$. The set of their Hilbert spaces forms the Fock space. Here states with different particle number $N$ are orthogonal to each other. The Hilbert vector $|0\rangle$ of the vacuum state should not be confused with the zero vector $|o\rangle$: we have $\langle 0|0\rangle = 1$, but $\langle o|o\rangle = 0$.

We begin with fermions, because the states are easier to normalize than those of 
bosons. 


5.3.4 Creation and Annihilation Operators for Fermions 


Let the operator $A_\nu^\dagger$ create a fermion in the one-particle state $|\nu\rangle$ from the vacuum $|0\rangle$:

$$A_\nu^\dagger |0\rangle = |\nu\rangle .$$




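Let generally $A_\nu^\dagger$ make the state $|\nu_1 \ldots \nu_N \nu\rangle_a$ from the $N$-fermion state $|\nu_1 \ldots \nu_N\rangle_a$, if this is possible at all: the state $|\nu\rangle$ has to be unoccupied previously, so that an antisymmetric $(N+1)$-particle state can be constructed:

$$A_\nu^\dagger |\nu_1 \ldots \nu_N\rangle_a = \begin{cases} |o\rangle & \text{if } \nu \in \{\nu_1 \ldots \nu_N\} , \\ |\nu_1 \ldots \nu_N \nu\rangle_a & \text{if } \nu \notin \{\nu_1 \ldots \nu_N\} . \end{cases}$$

Note the phase convention employed here: if $\nu$ is arranged differently, the state may differ in sign. For example, $|\nu \nu_1 \ldots \nu_N\rangle_a$ requires another creation operator. This will be discussed separately on p. 442. It follows that

$$|\nu_1 \ldots \nu_N\rangle_a = A_{\nu_N}^\dagger \cdots A_{\nu_1}^\dagger |0\rangle ,$$

and the antisymmetry requires

$$A_\nu^\dagger A_{\nu'}^\dagger = -A_{\nu'}^\dagger A_\nu^\dagger , \quad \text{in particular } (A_\nu^\dagger)^2 = 0 \text{ (Pauli principle)} .$$

States with different particle number should be orthogonal to each other. Therefore,

$$\langle 0| A_\nu^\dagger = \langle o| \quad \text{and} \quad \langle \nu'| A_\nu^\dagger = \langle \nu'|\nu\rangle\, \langle 0| .$$

For the operator $A_\nu$, Hermitian conjugate to $A_\nu^\dagger$, it follows that

$$A_\nu A_{\nu'} = -A_{\nu'} A_\nu , \qquad {}_a\langle \nu_1 \ldots \nu_N| = \langle 0| A_{\nu_1} \cdots A_{\nu_N} ,$$

together with

$$A_\nu |0\rangle = |o\rangle \quad \text{and} \quad A_\nu |\nu'\rangle = |0\rangle\, \langle \nu|\nu'\rangle .$$

We thus have

$$A_\nu |\nu_1 \ldots \nu_N\rangle_a = \begin{cases} |\nu_1 \ldots \nu_{N-1}\rangle_a & \text{if } \nu = \nu_N , \\ -|\nu_1 \ldots \nu_{N-2}\, \nu_N\rangle_a & \text{if } \nu = \nu_{N-1} , \\ \quad \vdots & \\ (-)^{N-1}\, |\nu_2 \ldots \nu_N\rangle_a & \text{if } \nu = \nu_1 , \\ |o\rangle & \text{if } \nu \notin \{\nu_1 \ldots \nu_N\} . \end{cases}$$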


The operator A, thus removes a fermion from the state |v). Therefore, A, is an 
annihilation operator of the fermion in the state |v). 

With these creation and annihilation operators, many-fermion states can be treated 
very conveniently—also if the particle number does not change at all, e.g., if equally 
many creation and annihilation operators occur in operator products. In particular, 
there is no longer any need to pay attention to the antisymmetry. We merely follow 
the calculation rules for the new operators: 




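$$A_\nu^\dagger A_{\nu'}^\dagger + A_{\nu'}^\dagger A_\nu^\dagger = 0 = A_\nu A_{\nu'} + A_{\nu'} A_\nu ,$$

$$A_\nu A_{\nu'}^\dagger + A_{\nu'}^\dagger A_\nu = \langle \nu|\nu'\rangle .$$

The first two commutation laws have already been proven. In addition, we have $\langle 0| A_\nu A_{\nu'}^\dagger |0\rangle = \langle \nu|\nu'\rangle$ and $\langle 0| A_{\nu'}^\dagger A_\nu |0\rangle = 0$. For more particles, we first consider the case $\nu \ne \nu'$: $A_\nu A_{\nu'}^\dagger$ and $A_{\nu'}^\dagger A_\nu$ create one fermion in the state $|\nu'\rangle$ and destroy one in the state $|\nu\rangle$, but the new states have opposite sign, e.g., $A_{\nu_N}^\dagger A_{\nu_{N-1}}$ changes from the state $|\nu_1 \ldots \nu_{N-1}\rangle_a$ to the state $|\nu_1 \ldots \nu_{N-2}\, \nu_N\rangle_a$, but $A_{\nu_{N-1}} A_{\nu_N}^\dagger$ results in $A_{\nu_{N-1}} |\nu_1 \ldots \nu_N\rangle_a = -|\nu_1 \ldots \nu_{N-2}\, \nu_N\rangle_a$. Therefore, it only remains to show that $A_\nu A_\nu^\dagger + A_\nu^\dagger A_\nu = 1$:

$$A_\nu A_\nu^\dagger |\nu_1 \ldots \nu_N\rangle_a = \begin{cases} |o\rangle & \text{if } \nu \in \{\nu_1 \ldots \nu_N\} , \\ |\nu_1 \ldots \nu_N\rangle_a & \text{if } \nu \notin \{\nu_1 \ldots \nu_N\} , \end{cases}$$

$$A_\nu^\dagger A_\nu |\nu_1 \ldots \nu_N\rangle_a = \begin{cases} |\nu_1 \ldots \nu_N\rangle_a & \text{if } \nu \in \{\nu_1 \ldots \nu_N\} , \\ |o\rangle & \text{if } \nu \notin \{\nu_1 \ldots \nu_N\} . \end{cases}$$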


With this our claim is proven: simple anti-commutation relations are valid for fermion 
field operators. 
An additional result is that the eigenvalue of

$$N_\nu = A_\nu^\dagger A_\nu$$

gives the occupation number of the state $|\nu\rangle$, and hence that $N_\nu$ is the occupation number operator. For the particle number operator, we must therefore take

$$N = \sum_\nu A_\nu^\dagger A_\nu .$$


Its eigenvalue gives the total number of fermions. 
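The calculation rules of this section can be tried out on a small example. The sketch below is illustrative only and rests on assumptions not made in the text: the one-particle labels are integers from an orthonormal basis, a state is a dictionary mapping sorted label tuples to integer coefficients, and a label list in another order is brought to sorted order with the sign of the required permutation. The convention that $A_\nu^\dagger$ appends $\nu$ at the end of the label list is the one used above.

```python
def canonical(coeff, labels):
    """Sort the labels by adjacent swaps; each swap flips the sign."""
    labels = list(labels)
    for end in range(len(labels) - 1, 0, -1):
        for j in range(end):
            if labels[j] > labels[j + 1]:
                labels[j], labels[j + 1] = labels[j + 1], labels[j]
                coeff = -coeff
    return coeff, tuple(labels)


def create(nu, state):
    """A_nu^dagger: append nu at the end; the result vanishes if nu is occupied."""
    out = {}
    for labels, coeff in state.items():
        if nu in labels:
            continue
        new_coeff, new_labels = canonical(coeff, labels + (nu,))
        out[new_labels] = out.get(new_labels, 0) + new_coeff
    return out


def annihilate(nu, state):
    """A_nu: move nu to the last slot (one sign per step), then remove it."""
    out = {}
    for labels, coeff in state.items():
        if nu not in labels:
            continue
        pos = labels.index(nu)
        sign = (-1) ** (len(labels) - 1 - pos)
        rest = labels[:pos] + labels[pos + 1:]
        out[rest] = out.get(rest, 0) + sign * coeff
    return out


def add(a, b):
    """Sum of two states, dropping vanishing coefficients."""
    out = dict(a)
    for labels, coeff in b.items():
        out[labels] = out.get(labels, 0) + coeff
    return {labels: coeff for labels, coeff in out.items() if coeff != 0}


def anticommutator(nu, nu_prime, state):
    """(A_nu A_nu'^dagger + A_nu'^dagger A_nu) applied to the state."""
    return add(
        annihilate(nu, create(nu_prime, state)),
        create(nu_prime, annihilate(nu, state)),
    )


def particle_number(state, modes):
    """N = sum over nu of A_nu^dagger A_nu, applied to the state."""
    total = {}
    for nu in modes:
        total = add(total, create(nu, annihilate(nu, state)))
    return total


state = {(1, 3): 1}                           # the two-fermion state |1 3>_a
print(anticommutator(2, 2, state))            # the state itself
print(anticommutator(1, 2, state))            # {} (zero vector)
print(particle_number(state, range(4)))       # twice the state
```

The empty dictionary plays the role of the zero vector $|o\rangle$, while the vacuum $|0\rangle$ would be `{(): 1}`.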


5.3.5 Creation and Annihilation Operators for Bosons 


So far we have associated each particle with a one-particle state. For bosons, such a state may be repeated very often, because many of them can be in the same one-particle state. As already shown on p. 437, the occupation numbers $n_\nu$ are important for the normalization. A particularly suitable representation for bosons is the so-called occupation-number representation. We fix an order for the one-particle states $|\nu\rangle$ and then give only the occupation numbers $n_\nu$, thus writing $|n_1 \ldots n_\nu \ldots\rangle$ with $N = \sum_\nu n_\nu$. If unoccupied states also occur, we may leave those out of the representation. Therefore, we order the one-particle states with respect to their energy (see p. 433). We consider here only the (symmetric) boson states $|\ldots n_\nu \ldots\rangle_s$.


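The boson creation operators $B_\nu^\dagger$ with the property

$$B_\nu^\dagger |\ldots n_\nu \ldots\rangle_s = |\ldots n_\nu{+}1 \ldots\rangle_s\; c(n_\nu{+}1) ,$$

whose normalization factors $c(n_\nu{+}1)$ remain unknown for the moment, have to commute with each other so that a symmetric state is produced:

$$B_\nu^\dagger B_{\nu'}^\dagger - B_{\nu'}^\dagger B_\nu^\dagger = 0 .$$

For the Hermitian adjoint annihilation operator, we have

$$B_\nu B_{\nu'} - B_{\nu'} B_\nu = 0 .$$

With ${}_s\langle \ldots n_\nu \ldots | B_\nu | \ldots n_\nu' \ldots \rangle_s^{\,*} = {}_s\langle \ldots n_\nu' \ldots | B_\nu^\dagger | \ldots n_\nu \ldots \rangle_s = \delta_{n_\nu',\, n_\nu + 1}\; c(n_\nu{+}1)$, we may infer

$$B_\nu |\ldots n_\nu' \ldots\rangle_s = |\ldots n_\nu'{-}1 \ldots\rangle_s\; c^*(n_\nu') .$$

Hence $c(0) = 0$ holds, because no particles can be destroyed in the vacuum:

$$B_\nu |0\rangle = |o\rangle .$$

For $\nu \ne \nu'$, it follows that $B_\nu B_{\nu'}^\dagger |\ldots n_\nu \ldots n_{\nu'} \ldots\rangle_s = B_{\nu'}^\dagger B_\nu |\ldots n_\nu \ldots n_{\nu'} \ldots\rangle_s$, so both are equal to $|\ldots n_\nu{-}1 \ldots n_{\nu'}{+}1 \ldots\rangle_s\; c^*(n_\nu)\, c(n_{\nu'}{+}1)$, in contrast to the case for $\nu = \nu'$, where

$$B_\nu B_\nu^\dagger |\ldots n_\nu \ldots\rangle_s = |\ldots n_\nu \ldots\rangle_s\; |c(n_\nu{+}1)|^2 ,$$

$$B_\nu^\dagger B_\nu |\ldots n_\nu \ldots\rangle_s = |\ldots n_\nu \ldots\rangle_s\; |c(n_\nu)|^2 .$$

If we choose therefore $c(n_\nu) = \sqrt{n_\nu}$, so that $|c(n_\nu{+}1)|^2 - |c(n_\nu)|^2 = 1$, then for boson field operators, the following simple commutator relations are valid:

$$B_\nu B_{\nu'}^\dagger - B_{\nu'}^\dagger B_\nu = \langle \nu|\nu'\rangle ,$$

$$B_\nu^\dagger B_{\nu'}^\dagger - B_{\nu'}^\dagger B_\nu^\dagger = 0 = B_\nu B_{\nu'} - B_{\nu'} B_\nu .$$

In addition then,

$$N_\nu = B_\nu^\dagger B_\nu$$

is the occupation-number operator for the one-particle state $|\nu\rangle$ and

$$N = \sum_\nu B_\nu^\dagger B_\nu$$

is the operator for the total number of bosons. We also have the equation

$$|n_1 n_2 \ldots\rangle_s = \frac{(B_1^\dagger)^{n_1}}{\sqrt{n_1!}}\, \frac{(B_2^\dagger)^{n_2}}{\sqrt{n_2!}} \cdots |0\rangle .$$

The field operators in Sect. 4.2.8 have the same properties: there we sought operators $N$ whose eigenvalues were the natural numbers $n$. And the ladder operators for harmonic oscillations in Sect. 4.5.4 also had those properties.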
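The choice $c(n_\nu) = \sqrt{n_\nu}$ can be checked numerically in the occupation-number representation. The following minimal sketch assumes two orthonormal modes and stores a state as a dictionary from occupation tuples to amplitudes; the function names are hypothetical and chosen only for this example.

```python
from math import factorial, sqrt


def create(mode, state):
    """B_mode^dagger |... n ...> = sqrt(n + 1) |... n+1 ...>."""
    out = {}
    for occ, amp in state.items():
        new = occ[:mode] + (occ[mode] + 1,) + occ[mode + 1:]
        out[new] = out.get(new, 0.0) + amp * sqrt(occ[mode] + 1)
    return out


def annihilate(mode, state):
    """B_mode |... n ...> = sqrt(n) |... n-1 ...>, zero vector for n = 0."""
    out = {}
    for occ, amp in state.items():
        if occ[mode] == 0:
            continue
        new = occ[:mode] + (occ[mode] - 1,) + occ[mode + 1:]
        out[new] = out.get(new, 0.0) + amp * sqrt(occ[mode])
    return out


def commutator(nu, nu_prime, state, tol=1e-12):
    """(B_nu B_nu'^dagger - B_nu'^dagger B_nu) applied to the state."""
    first = annihilate(nu, create(nu_prime, state))
    second = create(nu_prime, annihilate(nu, state))
    out = dict(first)
    for occ, amp in second.items():
        out[occ] = out.get(occ, 0.0) - amp
    return {occ: amp for occ, amp in out.items() if abs(amp) > tol}


def number_state(occupations):
    """Build |n_1 n_2 ...>_s from the vacuum with (B^dagger)^n / sqrt(n!)."""
    state = {tuple(0 for _ in occupations): 1.0}
    for mode, n in enumerate(occupations):
        for _ in range(n):
            state = create(mode, state)
        state = {occ: amp / sqrt(factorial(n)) for occ, amp in state.items()}
    return state


state = number_state((2, 1))
print(state)                      # amplitude 1 on the occupation tuple (2, 1)
print(commutator(0, 0, state))    # the state itself, up to rounding
print(commutator(0, 1, state))    # {} (zero vector)
```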


5.3.6 General Properties of Creation and Annihilation 
Operators 


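We now summarize the previous considerations. Except for the all-important sign in the commutation relations, the creation and annihilation operators for bosons and fermions are very similar. Therefore, we now write, with the upper sign for bosons and the lower sign for fermions,

$$[\Psi_\nu, \Psi_{\nu'}^\dagger]_\mp = \langle \nu|\nu'\rangle \quad \text{and} \quad [\Psi_\nu, \Psi_{\nu'}]_\mp = 0 = [\Psi_\nu^\dagger, \Psi_{\nu'}^\dagger]_\mp ,$$

where $[A, B]_\mp = AB \mp BA$. With these field operators, the many-body states $|n_1 \ldots\rangle_s$ and $|n_1 \ldots\rangle_a$ can be created from the vacuum state $|0\rangle$:

$$|n_1 n_2 \ldots\rangle_{s,a} = \cdots \frac{(\Psi_2^\dagger)^{n_2}}{\sqrt{n_2!}}\, \frac{(\Psi_1^\dagger)^{n_1}}{\sqrt{n_1!}}\, |0\rangle .$$

For fermions, the occupation numbers $n_\nu$ are only equal to zero or one, because for them $(\Psi_\nu^\dagger)^2 = -(\Psi_\nu^\dagger)^2$ vanishes, and hence $n_\nu! = 1$. In addition, with $\Sigma(i) = \sum_{k=i+1}^{\infty} n_k$, we have

$$\Psi_i^\dagger |\ldots n_i \ldots\rangle_{s,a} = (\pm)^{\Sigma(i)}\, |\ldots n_i{+}1 \ldots\rangle_{s,a}\, \sqrt{1 \pm n_i} ,$$

$$\Psi_i |\ldots n_i \ldots\rangle_{s,a} = (\pm)^{\Sigma(i)}\, |\ldots n_i{-}1 \ldots\rangle_{s,a}\, \sqrt{n_i} .$$

The other phase convention $\Sigma(i) = \sum_{k=1}^{i-1} n_k$, already mentioned in Sect. 5.3.4, is often used. It seems simple, because then $k$ only runs over a finite number of values, but it is less convenient because the states of higher energy are all unoccupied and usually only the states near the Fermi energy are important.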

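For a change of representation, viz.,

$$|\mu\rangle = \sum_\nu |\nu\rangle \langle \nu|\mu\rangle ,$$

new creation and annihilation operators are necessary. From

$$|\mu\rangle = \Psi_\mu^\dagger |0\rangle = \sum_\nu \Psi_\nu^\dagger |0\rangle\, \langle \nu|\mu\rangle ,$$

it follows generally that

$$\Psi_\mu^\dagger = \sum_\nu \langle \nu|\mu\rangle\, \Psi_\nu^\dagger \quad \Longleftrightarrow \quad \Psi_\mu = \sum_\nu \langle \mu|\nu\rangle\, \Psi_\nu .$$

For example, we can go over to the real-space representation. Let $\Psi^\dagger(\mathbf{r})$ create a particle at the position $\mathbf{r}$ and $\Psi(\mathbf{r})$ destroy one there. Then we have

$$\Psi^\dagger(\mathbf{r}) = \sum_\nu \langle \nu|\mathbf{r}\rangle\, \Psi_\nu^\dagger \quad \Longleftrightarrow \quad \Psi(\mathbf{r}) = \sum_\nu \langle \mathbf{r}|\nu\rangle\, \Psi_\nu ,$$

with $\langle \mathbf{r}|\nu\rangle = \psi_\nu(\mathbf{r})$. Conversely,

$$\Psi_\nu^\dagger = \int \mathrm{d}^3r\; \langle \nu|\mathbf{r}\rangle^*\, \Psi^\dagger(\mathbf{r}) \quad \Longleftrightarrow \quad \Psi_\nu = \int \mathrm{d}^3r\; \langle \nu|\mathbf{r}\rangle\, \Psi(\mathbf{r}) .$$

The commutation laws are transferred:

$$[\Psi_\mu, \Psi_{\mu'}^\dagger]_\mp = \sum_{\nu\nu'} \langle \mu|\nu\rangle \langle \nu'|\mu'\rangle\, [\Psi_\nu, \Psi_{\nu'}^\dagger]_\mp = \sum_\nu \langle \mu|\nu\rangle \langle \nu|\mu'\rangle = \langle \mu|\mu'\rangle ,$$

and correspondingly $[\Psi_\mu, \Psi_{\mu'}]_\mp = 0$. The particle number operator reads

$$N = \sum_\nu \Psi_\nu^\dagger \Psi_\nu = \sum_\nu \iint \mathrm{d}^3r\, \mathrm{d}^3r'\; \langle \mathbf{r}|\nu\rangle \langle \nu|\mathbf{r}'\rangle\, \Psi^\dagger(\mathbf{r})\, \Psi(\mathbf{r}') = \iint \mathrm{d}^3r\, \mathrm{d}^3r'\; \langle \mathbf{r}|\mathbf{r}'\rangle\, \Psi^\dagger(\mathbf{r})\, \Psi(\mathbf{r}') = \int \mathrm{d}^3r\; \Psi^\dagger(\mathbf{r})\, \Psi(\mathbf{r}) .$$

Thus $\Psi^\dagger(\mathbf{r})\, \Psi(\mathbf{r})$ is the particle density operator. Note that the expectation value of the occupation-number operator $\Psi_\mu^\dagger \Psi_\mu$ does not need to be an integer: if it is so in the basis $\{|\nu\rangle\}$, then it will not generally be so in the basis $\{|\mu\rangle\}$.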
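The closing remark can be made concrete with a small numerical sketch. It assumes (beyond what the text states) that the many-body state is an eigenstate of all occupation numbers $n_\nu$ in the old basis, so that $\langle \Psi_\nu^\dagger \Psi_{\nu'} \rangle = n_\nu\, \delta_{\nu\nu'}$; inserting $\Psi_\mu = \sum_\nu \langle\mu|\nu\rangle \Psi_\nu$ then gives $\langle \Psi_\mu^\dagger \Psi_\mu \rangle = \sum_\nu |\langle \nu|\mu\rangle|^2\, n_\nu$. The two-state rotation and the function name are hypothetical.

```python
from math import cos, sin, pi


def occupation_in_new_basis(overlaps, occupations):
    """<Psi_mu^dagger Psi_mu> = sum_nu |<nu|mu>|^2 n_nu.

    Valid only for a state with sharp occupation numbers n_nu in the old basis.
    """
    return sum(abs(c) ** 2 * n for c, n in zip(overlaps, occupations))


# Old basis: one particle in |1>, none in |2>.
occupations = [1, 0]

# New orthonormal basis obtained by a rotation through theta (illustrative).
theta = pi / 4
mu_plus = [cos(theta), sin(theta)]     # components <nu|mu_+>
mu_minus = [-sin(theta), cos(theta)]   # components <nu|mu_->

n_plus = occupation_in_new_basis(mu_plus, occupations)
n_minus = occupation_in_new_basis(mu_minus, occupations)
print(n_plus, n_minus, n_plus + n_minus)
```

Each new occupation is one half here, while their sum still equals the total particle number.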


5.3.7 The Two-Body System as an Example 


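Here there are only the permutations $|\nu_1\nu_2\rangle$ and $|\nu_2\nu_1\rangle$:

$$|\nu_1\nu_2\rangle_s = \frac{|\nu_1\nu_2\rangle + |\nu_2\nu_1\rangle}{\sqrt{2}\, \sqrt{1 + \langle \nu_1|\nu_2\rangle}} , \qquad |\nu_1\nu_2\rangle_a = \frac{|\nu_1\nu_2\rangle - |\nu_2\nu_1\rangle}{\sqrt{2}} .$$

For the matrix elements of an arbitrary operator $O$ and with

$$\langle \nu_1\nu_2 | O | \nu_1'\nu_2' \rangle = \langle \nu_2\nu_1 | O | \nu_2'\nu_1' \rangle ,$$

it follows generally that

$${}_{s,a}\langle \nu_1\nu_2 | O | \nu_1'\nu_2' \rangle_{s,a} = \frac{\langle \nu_1\nu_2 | O | \nu_1'\nu_2' \rangle \pm \langle \nu_1\nu_2 | O | \nu_2'\nu_1' \rangle}{\sqrt{1 \pm \langle \nu_1|\nu_2\rangle}\, \sqrt{1 \pm \langle \nu_1'|\nu_2'\rangle}} .$$

We shall only be concerned with one-particle operators $T$ and two-body operators $V$. Here, for a one-particle operator,

$$\langle \nu_1\nu_2 | T | \nu_1'\nu_2' \rangle = \langle \nu_1 | T | \nu_1' \rangle \langle \nu_2|\nu_2'\rangle + \langle \nu_1|\nu_1'\rangle \langle \nu_2 | T | \nu_2' \rangle .$$

The bra- and ket-states may thus differ in at most one particle; otherwise the matrix element vanishes. (For a two-body operator, the two states may differ in at most two particles.) For fermions, we must also have $\nu_1 \ne \nu_2$ and $\nu_1' \ne \nu_2'$, otherwise the parts cancel each other.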

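The last equation also follows if we build $T$ up from bilinear products of creation and annihilation operators and take the matrix elements of $T$ in the chosen representation as expansion coefficients:

$$T = \sum_{\nu\nu'} \langle \nu | T | \nu' \rangle\, \Psi_\nu^\dagger \Psi_{\nu'} .$$

We have in particular,

$${}_{s,a}\langle \nu_1\nu_2 |\, \Psi_\nu^\dagger \Psi_{\nu'}\, | \nu_1'\nu_2' \rangle_{s,a} = \frac{\langle 0 |\, \Psi_{\nu_1} \Psi_{\nu_2}\, \Psi_\nu^\dagger \Psi_{\nu'}\, \Psi_{\nu_2'}^\dagger \Psi_{\nu_1'}^\dagger\, | 0 \rangle}{\sqrt{1 \pm \langle \nu_1|\nu_2\rangle}\, \sqrt{1 \pm \langle \nu_1'|\nu_2'\rangle}} ,$$

and with this, the factor

$$\Psi_{\nu_1} \Psi_{\nu_2} \Psi_\nu^\dagger = \Psi_{\nu_1} \big( \langle \nu_2|\nu\rangle \pm \Psi_\nu^\dagger \Psi_{\nu_2} \big) = \langle \nu_2|\nu\rangle\, \Psi_{\nu_1} \pm \big( \langle \nu_1|\nu\rangle \pm \Psi_\nu^\dagger \Psi_{\nu_1} \big) \Psi_{\nu_2} ,$$

and also the other factor $\Psi_{\nu'} \Psi_{\nu_2'}^\dagger \Psi_{\nu_1'}^\dagger$ in the expectation value as the adjoint of $\Psi_{\nu_1'} \Psi_{\nu_2'} \Psi_{\nu'}^\dagger$. With $\langle 0| \Psi^\dagger = \langle o|$ and $\Psi |0\rangle = |o\rangle$, together with

$$\langle 0 |\, \Psi_\mu \Psi_{\mu'}^\dagger\, | 0 \rangle = \langle \mu|\mu'\rangle ,$$

we thus find

$${}_{s,a}\langle \nu_1\nu_2 |\, \Psi_\nu^\dagger \Psi_{\nu'}\, | \nu_1'\nu_2' \rangle_{s,a} = \langle \nu_2|\nu\rangle\, \frac{\langle \nu'|\nu_2'\rangle \langle \nu_1|\nu_1'\rangle \pm \langle \nu'|\nu_1'\rangle \langle \nu_1|\nu_2'\rangle}{\sqrt{1 \pm \langle \nu_1|\nu_2\rangle}\, \sqrt{1 \pm \langle \nu_1'|\nu_2'\rangle}} + \langle \nu_1|\nu\rangle\, \frac{\langle \nu'|\nu_1'\rangle \langle \nu_2|\nu_2'\rangle \pm \langle \nu'|\nu_2'\rangle \langle \nu_2|\nu_1'\rangle}{\sqrt{1 \pm \langle \nu_1|\nu_2\rangle}\, \sqrt{1 \pm \langle \nu_1'|\nu_2'\rangle}} ,$$

which is what was to be shown.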
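If we now consider a two-body operator, e.g., the interaction $V(\mathbf{r}_1, \mathbf{r}_2) = V(\mathbf{r}_2, \mathbf{r}_1)$, then

$${}_{s,a}\langle \nu_1\nu_2 | V | \nu_1'\nu_2' \rangle_{s,a} = \frac{\langle \nu_1\nu_2 | V | \nu_1'\nu_2' \rangle \pm \langle \nu_1\nu_2 | V | \nu_2'\nu_1' \rangle}{\sqrt{1 \pm \langle \nu_1|\nu_2\rangle}\, \sqrt{1 \pm \langle \nu_1'|\nu_2'\rangle}} .$$

With $\mu$ and $\mu'$ from the same basis as $\nu$ and $\nu'$, we may also write

$$V = \frac{1}{2} \sum_{\nu\mu\nu'\mu'} \langle \nu\mu | V | \nu'\mu' \rangle\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'} ,$$

because if we use

$$\Psi_{\nu_1} \Psi_{\nu_2} \Psi_\nu^\dagger \Psi_\mu^\dagger = \Psi_{\nu_1} \big( \langle \nu_2|\nu\rangle \pm \Psi_\nu^\dagger \Psi_{\nu_2} \big) \Psi_\mu^\dagger = \langle \nu_2|\nu\rangle \big( \langle \nu_1|\mu\rangle \pm \Psi_\mu^\dagger \Psi_{\nu_1} \big) \pm \big( \langle \nu_1|\nu\rangle \pm \Psi_\nu^\dagger \Psi_{\nu_1} \big) \big( \langle \nu_2|\mu\rangle \pm \Psi_\mu^\dagger \Psi_{\nu_2} \big)$$

in the previous equation and its adjoint for $\Psi_{\mu'} \Psi_{\nu'} \Psi_{\nu_2'}^\dagger \Psi_{\nu_1'}^\dagger$, along with $\langle 0| \Psi^\dagger = \langle o|$ and $\Psi |0\rangle = |o\rangle$, then it follows that

$${}_{s,a}\langle \nu_1\nu_2 | V | \nu_1'\nu_2' \rangle_{s,a} = \frac{\langle \nu_1\nu_2 | V | \nu_1'\nu_2' \rangle \pm \langle \nu_2\nu_1 | V | \nu_1'\nu_2' \rangle}{2\, \sqrt{1 \pm \langle \nu_1|\nu_2\rangle}\, \sqrt{1 \pm \langle \nu_1'|\nu_2'\rangle}} + \frac{\langle \nu_2\nu_1 | V | \nu_2'\nu_1' \rangle \pm \langle \nu_1\nu_2 | V | \nu_2'\nu_1' \rangle}{2\, \sqrt{1 \pm \langle \nu_1|\nu_2\rangle}\, \sqrt{1 \pm \langle \nu_1'|\nu_2'\rangle}} ,$$

and hence, with $\langle \nu\mu | V | \nu'\mu' \rangle = \langle \mu\nu | V | \mu'\nu' \rangle$, the original equation.

The expectation value of the symmetry-independent first term $\langle \nu_1\nu_2 | V | \nu_1\nu_2 \rangle$ is called the direct term, while that of the symmetry-dependent term $\langle \nu_1\nu_2 | V | \nu_2\nu_1 \rangle$ is known as the exchange term. The expansion of one- and two-body operators with respect to products of creation and annihilation operators turns out to be useful for all $N$-particle states, as we shall now show.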


5.3.8 Representation of One-Particle Operators 


One-particle operators such as the kinetic energy and the one-particle potential act on the degrees of freedom of only one particle. Clearly,

$$T = \sum_{\nu\nu'} \langle \nu | T | \nu' \rangle\, \Psi_\nu^\dagger \Psi_{\nu'} ,$$

where $\Psi_\nu^\dagger \Psi_{\nu'}$ is the one-particle density operator. In the occupation-number representation, it has the matrix elements

$${}_{s,a}\langle \ldots n_i \ldots n_j \ldots |\, \Psi_i^\dagger \Psi_j\, | \ldots n_i' \ldots n_j' \ldots \rangle_{s,a} = (\pm)^{\Sigma(i) - \Sigma(j)}\, \sqrt{n_i\, n_j'}\;\; {}_{s,a}\langle \ldots n_i{-}1 \ldots n_j \ldots | \ldots n_i' \ldots n_j'{-}1 \ldots \rangle_{s,a} .$$

We have in particular,

$${}_{s,a}\langle \nu_1\nu_2 \ldots \nu_N | T | \nu_1\nu_2 \ldots \nu_N \rangle_{s,a} = \sum_{n=1}^{N} \langle \nu_n | T | \nu_n \rangle ,$$

$${}_{s,a}\langle \nu_1\nu_2 \ldots \nu_N | T | \nu_1'\nu_2 \ldots \nu_N \rangle_{s,a} = \langle \nu_1 | T | \nu_1' \rangle\, \sqrt{n_1\, n_1'} , \quad \text{for } \nu_1 \ne \nu_1' ,$$

$${}_{s,a}\langle \nu_1\nu_2 \ldots \nu_N | T | \nu_1'\nu_2' \ldots \nu_N \rangle_{s,a} = 0 , \quad \text{for } \nu_1 \text{ and } \nu_2 \notin \{\nu_1', \nu_2'\} .$$

A one-particle operator can change the quantum numbers of at most one particle. Therefore, its matrix elements do not depend on the symmetry of the many-body state. Its expectation value is the sum of the expectation values of all occupied one-particle states (see Fig. 5.11 left).

Fig. 5.11 The two Feynman diagrams are to be read from bottom to top, i.e., from initial to final state(s). One-particle operators ($T$) act on one particle, two-body operators ($V$) on two. Uninvolved partners do not change their state and would appear here as simple straight arrows. Such diagrams are useful for compositions of operations, similar to those in Figs. 5.7 (right) and 5.8


5.3.9 Representation of Two-Body Operators 


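Two-body operators like the interaction $V$ between two particles can alter the quantum numbers of two particles. Here we have

$$V = \frac{1}{2} \sum_{\nu\mu\nu'\mu'} \langle \nu\mu | V | \nu'\mu' \rangle\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'} .$$

The expression

$$V = \sum_{\nu \le \mu,\; \nu' \le \mu'} \frac{{}_{s,a}\langle \nu\mu | V | \nu'\mu' \rangle_{s,a}}{\sqrt{1 + \langle \nu|\mu\rangle}\, \sqrt{1 + \langle \nu'|\mu'\rangle}}\; \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'}$$

is equivalent, where, as in Sect. 5.3.7,

$${}_{s,a}\langle \nu\mu | V | \nu'\mu' \rangle_{s,a} = \frac{\langle \nu\mu | V | \nu'\mu' \rangle \pm \langle \nu\mu | V | \mu'\nu' \rangle}{\sqrt{1 \pm \langle \nu|\mu\rangle}\, \sqrt{1 \pm \langle \nu'|\mu'\rangle}} .$$

With the commutation behavior of the annihilation operators, we obtain

$$\big( \langle \nu\mu | V | \nu'\mu' \rangle \pm \langle \nu\mu | V | \mu'\nu' \rangle \big)\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'} = \langle \nu\mu | V | \nu'\mu' \rangle\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'} + \langle \nu\mu | V | \mu'\nu' \rangle\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\nu'} \Psi_{\mu'} ,$$

and in the sum $\nu' \le \mu'$, we may swap these two indices (with $\mu' \le \nu'$) without any consequences. Instead of the claimed equation, we then have

$$\sum_{\nu \le \mu,\; \nu' \le \mu'} \frac{\langle \nu\mu | V | \nu'\mu' \rangle \pm \langle \nu\mu | V | \mu'\nu' \rangle}{\big( 1 + \langle \nu|\mu\rangle \big) \big( 1 + \langle \nu'|\mu'\rangle \big)}\; \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'}$$

$$= \frac{1}{4} \sum_{\nu\mu,\; \nu'\mu'} \big( \langle \nu\mu | V | \nu'\mu' \rangle \pm \langle \nu\mu | V | \mu'\nu' \rangle \big)\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'}$$

$$= \frac{1}{4} \sum_{\nu\mu\nu'\mu'} \langle \nu\mu | V | \nu'\mu' \rangle\, \Psi_\nu^\dagger \Psi_\mu^\dagger \big( \Psi_{\mu'} \Psi_{\nu'} \pm \Psi_{\nu'} \Psi_{\mu'} \big) ,$$

where the summation indices have been renamed at the end. As above, the upper sign holds for bosons, the lower one for fermions. Therefore, $\Psi_{\nu'} \Psi_{\mu'} = \pm \Psi_{\mu'} \Psi_{\nu'}$. Thus the two expressions are indeed equal.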

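The expectation value of a two-body operator consists of the direct and exchange terms. It depends on the symmetry of the many-body state and, according to Sect. 5.3.6, this yields

$${}_{s,a}\langle \ldots n_i \ldots |\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'}\, | \ldots n_i \ldots \rangle_{s,a} = \sum_{i \ne j} n_i n_j\, \big( \langle \nu_i|\nu'\rangle \langle \nu_j|\mu'\rangle \pm \langle \nu_i|\mu'\rangle \langle \nu_j|\nu'\rangle \big)\, \langle \nu|\nu_i\rangle \langle \mu|\nu_j\rangle + \sum_i n_i (n_i - 1)\, \langle \nu_i|\nu'\rangle \langle \nu_i|\mu'\rangle \langle \nu|\nu_i\rangle \langle \mu|\nu_i\rangle .$$

Then quite generally,

$${}_{s,a}\langle \ldots n_i \ldots | V | \ldots n_i \ldots \rangle_{s,a} = \frac{1}{2} \sum_{i \ne j} n_i n_j\, \big( \langle \nu_i\nu_j | V | \nu_i\nu_j \rangle \pm \langle \nu_i\nu_j | V | \nu_j\nu_i \rangle \big) + \frac{1}{2} \sum_i n_i (n_i - 1)\, \langle \nu_i\nu_i | V | \nu_i\nu_i \rangle .$$

For fermions, the second sum does not contribute, since none of the states is doubly occupied. Therefore, the result may also be reformulated as

$${}_{s,a}\langle \nu_1 \ldots \nu_N | V | \nu_1 \ldots \nu_N \rangle_{s,a} = \frac{1}{2} \sum_{n \ne m}^{N} \frac{\langle \nu_n\nu_m | V | \nu_n\nu_m \rangle \pm \langle \nu_n\nu_m | V | \nu_m\nu_n \rangle}{1 + \langle \nu_n|\nu_m\rangle} = \frac{1}{2} \sum_{n \ne m}^{N} {}_{s,a}\langle \nu_n\nu_m | V | \nu_n\nu_m \rangle_{s,a} ,$$

where we also use p. 444. For the expectation value, it thus follows that

$${}_{s,a}\langle \nu_1 \ldots \nu_N | V | \nu_1 \ldots \nu_N \rangle_{s,a} = \sum_{n < m}^{N} {}_{s,a}\langle \nu_n\nu_m | V | \nu_n\nu_m \rangle_{s,a} .$$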


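Apart from this, we must consider the off-diagonal matrix elements of $V$. A two-body operator can alter the quantum numbers of at most two particles (see Fig. 5.11 right). For $\nu_1 \ne \nu_1'$, it follows that (compare with the result for one-particle operators in the last section)

$${}_{s,a}\langle \nu_1\nu_2 \ldots \nu_N | V | \nu_1'\nu_2 \ldots \nu_N \rangle_{s,a} = \sum_{n=2}^{N} {}_{s,a}\langle \nu_1\nu_n | V | \nu_1'\nu_n \rangle_{s,a}\; \sqrt{\frac{n_1}{1 + \langle \nu_1|\nu_n\rangle}\; \frac{n_1'}{1 + \langle \nu_n|\nu_1'\rangle}} ,$$

and for $\nu_1$ and $\nu_2 \notin \{\nu_1', \nu_2'\}$,

$${}_{s,a}\langle \nu_1\nu_2\nu_3 \ldots \nu_N | V | \nu_1'\nu_2'\nu_3 \ldots \nu_N \rangle_{s,a} = {}_{s,a}\langle \nu_1\nu_2 | V | \nu_1'\nu_2' \rangle_{s,a}\; \sqrt{\frac{n_1 \big( n_2 - \langle \nu_1|\nu_2\rangle \big)}{1 + \langle \nu_1|\nu_2\rangle}\; \frac{n_1' \big( n_2' - \langle \nu_1'|\nu_2'\rangle \big)}{1 + \langle \nu_1'|\nu_2'\rangle}} .$$

As before, the particle numbers $n_i$ refer to the bra-vector and the particle numbers $n_i'$ to the ket-vector. If the two vectors differ in more than two particles, the matrix element is zero.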


5.3.10 Time Dependence 


So far all our considerations of many-body systems are valid for a fixed time. Now we ask how the creation and annihilation operators behave under exchange at different times. We assume that the time-dependence is determined by a Hermitian Hamilton operator consisting only of one- and two-body operators, which neither depends on time explicitly nor changes the particle number:

$$H = \sum_{\nu\nu'} \langle \nu | T | \nu' \rangle\, \Psi_\nu^\dagger \Psi_{\nu'} + \frac{1}{2} \sum_{\nu\mu\nu'\mu'} \langle \nu\mu | V | \nu'\mu' \rangle\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'} .$$




Here, in addition to the kinetic energy, T may also include other one-particle opera- 
tors. We shall return to this shortly. 

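The Schrödinger picture is less suitable for field operators than the Heisenberg picture (and the Dirac picture), because in field theory we also trace the states back to operators acting on the vacuum, which does not depend on time. Therefore, we now transfer the relation $A_H(t) = U^\dagger(t)\, A_H(0)\, U(t)$ with $U(t) = \exp(-\mathrm{i}Ht/\hbar)$ from Sect. 4.4.2 to field theory (without reference to the Heisenberg picture):

$$\Psi(t) = U^\dagger(t)\, \Psi(0)\, U(t) \quad \Longleftrightarrow \quad \Psi^\dagger(t) = U^\dagger(t)\, \Psi^\dagger(0)\, U(t) .$$

We thus take over the equations valid for observables and apply them to field operators. Using $\dot{U} = -\mathrm{i}HU/\hbar = -\mathrm{i}UH/\hbar$, this yields

$$\frac{\mathrm{d}\Psi}{\mathrm{d}t} = \frac{\mathrm{i}}{\hbar}\, [H, \Psi] \quad \Longleftrightarrow \quad \frac{\mathrm{d}\Psi^\dagger}{\mathrm{d}t} = \frac{\mathrm{i}}{\hbar}\, [H, \Psi^\dagger] .$$

With $[AB, C]_- = A\, [B, C]_\mp - [C, A]_\mp\, B$, we obtain $[\Psi_\nu^\dagger \Psi_{\nu'}, \Psi_\kappa] = -\langle \kappa|\nu\rangle\, \Psi_{\nu'}$, and also

$$[\Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'}, \Psi_\kappa] = [\Psi_\nu^\dagger \Psi_\mu^\dagger, \Psi_\kappa]\, \Psi_{\mu'} \Psi_{\nu'} ,$$

with

$$[\Psi_\nu^\dagger \Psi_\mu^\dagger, \Psi_\kappa] = \mp \langle \kappa|\mu\rangle\, \Psi_\nu^\dagger - \langle \kappa|\nu\rangle\, \Psi_\mu^\dagger .$$

For bosons and also for fermions, this leads to

$$[\Psi_\nu^\dagger \Psi_{\nu'}, \Psi_\kappa] = -\langle \kappa|\nu\rangle\, \Psi_{\nu'} ,$$

$$[\Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'}, \Psi_\kappa] = -\langle \kappa|\mu\rangle\, \Psi_\nu^\dagger \Psi_{\nu'} \Psi_{\mu'} - \langle \kappa|\nu\rangle\, \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'} .$$

Therefore, for both sorts of particles, the Heisenberg equation with the chosen Hamilton operator can be reformulated as

$$\mathrm{i}\hbar\, \frac{\mathrm{d}\Psi_\kappa}{\mathrm{d}t} = \sum_{\nu'} \langle \kappa | T | \nu' \rangle\, \Psi_{\nu'} + \sum_{\mu\nu'\mu'} \langle \kappa\mu | V | \nu'\mu' \rangle\, \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'} ,$$

if we use $\langle \nu\mu | V | \nu'\mu' \rangle = \langle \mu\nu | V | \mu'\nu' \rangle$.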

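Later on we shall introduce the average two-body interaction $\overline{V}$, a one-particle operator which can be combined with $T$ to give $H_0 = T + \overline{V}$. Subtracting it from $V$ yields the residual interaction $V - \overline{V}$ as a two-body operator, which is often small and unimportant. If we neglect it and take the one-particle basis which diagonalizes $H_0$, we obtain

$$\mathrm{i}\hbar\, \frac{\mathrm{d}\Psi_\nu}{\mathrm{d}t} = \langle \nu | H_0 | \nu \rangle\, \Psi_\nu , \quad \text{e.g.,} \quad \mathrm{i}\hbar\, \frac{\partial \Psi(t, \mathbf{r})}{\partial t} \approx \langle \mathbf{r} | H_0 | \mathbf{r} \rangle\, \Psi(t, \mathbf{r}) .$$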




This is similar to the usual Schrödinger equation. However, $\Psi_\nu$ is not a state, but an operator, and a matrix element of the Hamilton operator $H_0$ is to be taken. We discuss this further in the next section.

Before that we shall set $\langle \nu | H_0 | \nu \rangle = \hbar\omega_\nu$, whence

$$\Psi_\nu(t) = \Psi_\nu(0)\, \exp(-\mathrm{i}\omega_\nu t) \quad \text{and} \quad \Psi_\nu^\dagger(t) = \Psi_\nu^\dagger(0)\, \exp(+\mathrm{i}\omega_\nu t) .$$

In the last paragraph the two-body interaction was neglected. The equations there are nevertheless valid (without neglecting it) in the Dirac picture, which includes the time dependence of the states due to $V$. So far $H_0$ should be sufficiently simple for mathematical treatment, but now we distinguish between $H_0$ and $V$ through physical properties, namely, whether one- or two-body operators are involved.


5.3.11 Wave-Particle Dualism


In this chapter we have always started from (several) particles, but we would also 
have arrived at creation and annihilation operators if we had quantized the wave 
picture, i.e., if we had taken each field strength as a Hermitian operator (e.g., the 
electromagnetic field, as will be shown in Sect. 5.5). In fact, such a field quantization 
would have taken us beyond the usual scope of a course on quantum mechanics, 
but otherwise would also have had many advantages. Note that the term second 
quantization instead of field quantization is misleading: we quantize either once in 
the field picture or once in the particle picture, with several particles. 

So far we have investigated the laws governing the behavior of particles and looked at them as representatives of a class of identical particles. For single particles, there are only statements about probabilities. Therefore, we always take a very large number $N$ of equal particles and use them to repeat the same experiment. The more often the same particle attribute appears, the higher its probability. But this probability now shows interference effects and therefore requires a wave theory. We assume for these considerations that the particles do not act on each other, which we also had to do when deriving the generalized Schrödinger equation in the last section.

For $N \gg 1$, it is of no importance whether $N$ is a natural number: even a (small) uncertainty in the particle number might occur. It is just then that a sharp probability statement (appropriate for the wave picture) holds for single particles! On the other hand, if an uncertainty in the wave quantities (phase) is not important, then sharp statements in the particle picture are possible. This has already been pointed out for the uncertainty relation between particle number and phase (Sect. 4.2.9).

Considering the relative frequency, it is easily overlooked how the particle picture is contained in such a seemingly unimportant constraint: $N$ had to be a natural number, and other values were meaningless. This granularity is foreign to the wave picture. Field intensities and wave functions are appropriate there, but classically these distributions can be arbitrarily normalized. Clearly, the wave picture then has to be modified in such a way that the arbitrary values for $N$ become restricted to natural numbers by quantum conditions, and the observables of the wave picture (field strengths and intensities) have to become operators. We have become acquainted with the commutation laws for the field operators in this chapter.

As long as wave functions are taken as classical field quantities, not as probability 
amplitudes normalized to 1, the Schrödinger equation is not an equation of quantum 
mechanics, but of classical physics. 


5.3.12 Summary: Many-Body Systems 


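In the quantum mechanics of many-particle problems, bosons and fermions behave differently: bosons form symmetric states and fermions antisymmetric states. Such states with special exchange symmetry are easily treated by introducing creation and annihilation operators $\Psi^\dagger$ and $\Psi$, which satisfy the commutation laws (upper sign for bosons, lower for fermions):

$$\Psi_\nu \Psi_{\nu'}^\dagger \mp \Psi_{\nu'}^\dagger \Psi_\nu = \langle \nu|\nu'\rangle \quad \text{and} \quad \Psi_\nu \Psi_{\nu'} \mp \Psi_{\nu'} \Psi_\nu = 0 .$$

Both are called field operators. They also arise when quantizing the wave picture. Here it is best to work in the Heisenberg or Dirac picture, where $\Psi_\nu(t) = \Psi_\nu(0)\, \exp(-\mathrm{i}\omega_\nu t)$.

It is also important to distinguish between one- and two-body operators:

$$T = \sum_{\nu\nu'} \langle \nu | T | \nu' \rangle\, \Psi_\nu^\dagger \Psi_{\nu'} \quad \text{and} \quad V = \frac{1}{2} \sum_{\nu\mu\nu'\mu'} \langle \nu\mu | V | \nu'\mu' \rangle\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'} .$$

Here $T$ can also contain “average one-particle potentials”.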


5.4 Fermions 


5.4.1 Fermi Gas in the Ground State 


As a first application, we shall evaluate the one- and two-body densities

$$\rho(\mathbf{r}) = {}_a\langle \nu_1 \ldots \nu_N |\, \Psi^\dagger(\mathbf{r})\, \Psi(\mathbf{r})\, | \nu_1 \ldots \nu_N \rangle_a ,$$

$$\rho(\mathbf{r}, \mathbf{r}') = {}_a\langle \nu_1 \ldots \nu_N |\, \Psi^\dagger(\mathbf{r})\, \Psi^\dagger(\mathbf{r}')\, \Psi(\mathbf{r}')\, \Psi(\mathbf{r})\, | \nu_1 \ldots \nu_N \rangle_a ,$$

of a Fermi gas in the ground state, i.e., the probability density for a particle at the position $\mathbf{r}$ or for two particles at the positions $\mathbf{r}$ and $\mathbf{r}'$. Actually, in a Fermi gas, the individual fermions do not interact with each other, but the antisymmetry nevertheless correlates them. Since only one fermion may be in any one-particle state, in the ground state of the Fermi gas, the particles are distributed over the various states with as low a total energy as possible.

The following calculation would be more complicated if we enclosed the fermions in a cube of volume $a^3$, as in Sect. 4.5.3. It is easier with periodic boundary conditions, i.e., with $\psi(x, y, z) = \psi(x{+}a, y, z) = \psi(x, y{+}a, z) = \psi(x, y, z{+}a)$. They lead to the eigenfunctions

$$\psi_n(\mathbf{r}) = \frac{\exp(\mathrm{i}\, \mathbf{k}_n \cdot \mathbf{r})}{\sqrt{V}} , \quad V = a^3 ,$$

if we leave out the spin functions $\chi_n(s)$. The wave vector $\mathbf{k}_n$ and the energy $E_n$ are then determined by the constraints

$$\mathbf{k}_n = \frac{2\pi}{a}\, \mathbf{n}_n \quad \text{and} \quad E_n = \frac{(\hbar k_n)^2}{2m} ,$$

where each Cartesian component of $\mathbf{n}_n$ has to be an integer $(0, \pm 1, \ldots)$. With the cube and impenetrable walls (see p. 355), each component of $\mathbf{n}_n$ takes only the values $1, 2, 3, \ldots$. However, then for $\mathbf{k}_n \propto \mathbf{n}_n$, there is a factor $\pi/a$ instead of $2\pi/a$. Moreover, for periodic boundary conditions, all states apart from those with a vanishing $\mathbf{n}_n$ component are more strongly degenerate by the factor $2^3 = 8$, but lie further apart from each other by the same factor than for the cube. The density of states is the same in both cases. But the boundary conditions are not important for our problem. In particular, as in Sect. 4.5.3, for the number of states with $E \le E_F$, a factor of 2 accounts for the two spin states of spin-1/2 particles and we have

$$N \approx 2\, \frac{V}{6\pi^2}\, k_F^3 .$$
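The estimate for $N$ can be compared with a direct count of the allowed wave vectors. The sketch below is only an illustration with hypothetical function names: with $\mathbf{k} = (2\pi/a)\, \mathbf{n}$ and $k_F = 2\pi R/a$, the condition $|\mathbf{k}| \le k_F$ becomes $|\mathbf{n}| \le R$, and the formula $2\, V k_F^3/(6\pi^2)$ turns into $8\pi R^3/3$.

```python
from math import pi


def count_states(radius):
    """Count integer vectors n with |n| <= radius, times 2 for spin."""
    r = int(radius)
    count = 0
    for nx in range(-r, r + 1):
        for ny in range(-r, r + 1):
            for nz in range(-r, r + 1):
                if nx * nx + ny * ny + nz * nz <= radius * radius:
                    count += 1
    return 2 * count


def continuum_estimate(radius):
    """N ~ 2 V k_F^3 / (6 pi^2) with V = a^3 and k_F = 2 pi radius / a."""
    return 8.0 * pi * radius ** 3 / 3.0


for radius in (1, 5, 10, 20):
    exact = count_states(radius)
    estimate = continuum_estimate(radius)
    print(radius, exact, round(estimate, 1), round(exact / estimate, 4))
```

The ratio approaches 1 as the Fermi sphere contains more states, which is why the discreteness of the wave vectors (and with it the boundary conditions) does not matter for large $N$.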


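With the above-mentioned wave functions, we shall now calculate the one- and two-particle densities $\rho(\mathbf{r})$ and $\rho(\mathbf{r}, \mathbf{r}')$. Generally, we have

$${}_a\langle \nu_1 \ldots \nu_N |\, \Psi^\dagger(\mathbf{r})\, \Psi(\mathbf{r})\, | \nu_1 \ldots \nu_N \rangle_a = \sum_{n=1}^{N} |\psi_n(\mathbf{r})|^2 ,$$

and using $|\chi_n(s)|^2 = 1 = |\chi_m(s)|^2$,

$${}_a\langle \nu_1 \ldots \nu_N |\, \Psi^\dagger(\mathbf{r})\, \Psi^\dagger(\mathbf{r}')\, \Psi(\mathbf{r}')\, \Psi(\mathbf{r})\, | \nu_1 \ldots \nu_N \rangle_a = \frac{1}{2} \sum_{n,m=1}^{N} \Big\{ |\psi_n(\mathbf{r})|^2\, |\psi_m(\mathbf{r}')|^2 - \psi_n^*(\mathbf{r}')\, \psi_m^*(\mathbf{r})\, \psi_n(\mathbf{r})\, \psi_m(\mathbf{r}')\, |\langle s_n|s_m\rangle|^2$$

$$\qquad + |\psi_n(\mathbf{r}')|^2\, |\psi_m(\mathbf{r})|^2 - \psi_n^*(\mathbf{r})\, \psi_m^*(\mathbf{r}')\, \psi_n(\mathbf{r}')\, \psi_m(\mathbf{r})\, |\langle s_n|s_m\rangle|^2 \Big\} .$$

Here we see immediately the advantage of periodic boundary conditions: the one-particle density is then constant and given by

$$\rho(\mathbf{r}) = \frac{N}{V} = \frac{k_F^3}{3\pi^2} \equiv \rho_0 ,$$

and the two-particle density simplifies to

$$\rho(\mathbf{r}, \mathbf{r}') = \frac{1}{V^2} \sum_{n,m=1}^{N} \Big\{ 1 - \tfrac{1}{2} \exp\{\mathrm{i}\, (\mathbf{k}_n - \mathbf{k}_m) \cdot (\mathbf{r} - \mathbf{r}')\}\, |\langle s_n|s_m\rangle|^2 - \tfrac{1}{2} \exp\{\mathrm{i}\, (\mathbf{k}_n - \mathbf{k}_m) \cdot (\mathbf{r}' - \mathbf{r})\}\, |\langle s_n|s_m\rangle|^2 \Big\} .$$

The double sum can be approximated by a double integral, integrating over $\mathrm{d}^3n = 2V/(2\pi)^3\, \mathrm{d}^3k$, which includes a factor of 2 for spin-1/2 particles. For the latter, we have on average $|\langle s_n|s_m\rangle|^2 = 1/2$. Using this, the double integral factorizes:

$$\rho(\mathbf{r}, \mathbf{r}') \approx \frac{N^2}{V^2} - \frac{1}{2V^2}\, \Big| \frac{2V}{(2\pi)^3} \int \mathrm{d}^3k\; \exp\{\mathrm{i}\, \mathbf{k} \cdot (\mathbf{r} - \mathbf{r}')\} \Big|^2 .$$

Here we have to integrate over all directions of $\mathbf{k}$ and the modulus from 0 to $k_F = (3\pi^2 \rho_0)^{1/3}$. According to p. 409, we obtain $\int \mathrm{d}\Omega_k\, \exp(\mathrm{i}\, \mathbf{k} \cdot \mathbf{a}) = 4\pi \sin ka/(ka)$, and consequently, using $F_1(x) = x^{-1} \sin x - \cos x$ from p. 400,

$$\rho(\mathbf{r}, \mathbf{r}') = \rho_0^{\,2}\, \Big\{ 1 - \frac{1}{2} \Big( \frac{3 F_1(x)}{x^2} \Big)^2 \Big\} \quad \text{with} \quad x = k_F\, |\mathbf{r} - \mathbf{r}'| .$$

Of course, the factor of 1/2 comes from the two spin states.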

The antisymmetry thus correlates fermions of equal spin. They tend to avoid each 
other, each fermion being surrounded by an exchange hole (see Fig. 5.12). This 
anti-correlation shows up only for short distances. 

Therefore, the boundary conditions are not important for this consideration, and the function $(3F_1/x^2)^2$ may be approximated by a Gauss function.
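The quality of the Gauss approximation can be checked numerically. The following sketch evaluates $\rho(\mathbf{r}, \mathbf{r}')/\rho_0^{\,2} = 1 - \tfrac{1}{2}(3F_1(x)/x^2)^2$ and compares it with $1 - \tfrac{1}{2}\exp(-x^2/5)$, the approximation quoted in the caption of Fig. 5.12. The small-$x$ series $3F_1(x)/x^2 \approx 1 - x^2/10$ used near the origin and the function names are choices made for this example.

```python
from math import cos, exp, sin


def exchange_factor(x):
    """3 F_1(x) / x^2 with F_1(x) = sin(x)/x - cos(x); series near x = 0."""
    if abs(x) < 1e-3:
        return 1.0 - x * x / 10.0
    return 3.0 * (sin(x) / x - cos(x)) / (x * x)


def pair_density_ratio(x):
    """rho(r, r') / rho_0^2 as a function of x = k_F |r - r'|."""
    return 1.0 - 0.5 * exchange_factor(x) ** 2


def gauss_approximation(x):
    """The approximation 1 - exp(-x^2/5)/2 from the caption of Fig. 5.12."""
    return 1.0 - 0.5 * exp(-x * x / 5.0)


grid = [0.05 * i for i in range(121)]          # x from 0 to 6
largest_deviation = max(
    abs(pair_density_ratio(x) - gauss_approximation(x)) for x in grid
)
print(pair_density_ratio(0.0), pair_density_ratio(6.0), largest_deviation)
```

At $x = 0$ the ratio is $1/2$: only fermions of equal spin avoid each other, so half of the pairs remain uncorrelated.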


Fig. 5.12 Exchange hole around a fermion. Two-body density $\rho(\mathbf{r}, \mathbf{r}')/\rho_0^{\,2}$ as a function of $x = k_F |\mathbf{r} - \mathbf{r}'|$ (continuous red) and the approximation $1 - \tfrac{1}{2} \exp(-x^2/5)$ (dashed blue)


5.4.2 Hartree-Fock Equations 


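Each $N$-fermion state can be expanded in the basis $\{|\nu_1 \ldots \nu_N\rangle_a\}$. We shall use this freedom in the choice of one-particle states $\{|\nu\rangle\}$ to diagonalize the Hamilton operator

$$H = \sum_{\nu\nu'} \langle \nu | H_0 | \nu' \rangle\, \Psi_\nu^\dagger \Psi_{\nu'} + \frac{1}{2} \sum_{\nu\mu\nu'\mu'} \langle \nu\mu | V | \nu'\mu' \rangle\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'}$$

as well as possible with a single state $|\nu_1 \ldots \nu_N\rangle_a$. Here, in addition to the kinetic energy, $H_0$ also contains the potential energy which originates from external forces, while $V$ describes the coupling of the single fermions among each other.

The diagonal elements of the Hamilton operator $H = H_0 + V$ just mentioned, viz.,

$${}_a\langle \nu_1 \ldots \nu_N | H | \nu_1 \ldots \nu_N \rangle_a = \sum_{n=1}^{N} \langle \nu_n | H_0 | \nu_n \rangle + \sum_{n<m}^{N} {}_a\langle \nu_n\nu_m | V | \nu_n\nu_m \rangle_a ,$$

supply the energy eigenvalues in zeroth order perturbation theory. Of the remaining matrix elements ${}_a\langle \nu_1 \ldots \nu_N | H | \nu_1' \ldots \nu_N' \rangle_a$, all of those whose bra- and ket-states differ in more than two particles will in fact vanish, but generally neither

$${}_a\langle \nu_1\nu_2\nu_3 \ldots \nu_N | H | \nu_1'\nu_2'\nu_3 \ldots \nu_N \rangle_a = {}_a\langle \nu_1\nu_2 | V | \nu_1'\nu_2' \rangle_a , \quad \text{with } \nu_1, \nu_2 \notin \{\nu_1', \nu_2'\} ,$$

nor the matrix elements which are not diagonal with respect to just one particle will vanish. For example,

$${}_a\langle \nu_1\nu_2 \ldots \nu_N | H | \nu_1'\nu_2 \ldots \nu_N \rangle_a = \langle \nu_1 | H_0 | \nu_1' \rangle + \sum_{n=1}^{N} {}_a\langle \nu_1\nu_n | V | \nu_1'\nu_n \rangle_a ,$$

if $\nu_1 \ne \nu_1'$. At least this second kind of non-diagonal element vanishes if we determine the basis $\{|\nu\rangle\}$ (one-particle states) from the Hartree-Fock equations:

$$\langle \nu | H_0 | \nu' \rangle + \sum_{n=1}^{N} {}_a\langle \nu\nu_n | V | \nu'\nu_n \rangle_a = e_\nu\, \langle \nu|\nu'\rangle .$$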


We shall thus derive the one-particle states $|\nu\rangle$ and the one-particle energies $e_\nu$ from the Hartree-Fock equations and take approximately $|\nu_1 \ldots \nu_N\rangle_a$ for the $N$-fermion states. In general, not all the non-diagonal elements of $H$ will vanish with this choice, but a better approximation can be obtained using a superposition of several states; we shall come back to this in the next section.

In the Hartree-Fock equations, the test particle is coupled to the remaining fermions. The sum of the one-particle energies $e_n$ of the occupied states counts this coupling twice, so this is not equal to the ground-state energy $E_0$ of the $N$ fermions:

$$E_0 = {}_a\langle \nu_1 \ldots \nu_N | H | \nu_1 \ldots \nu_N \rangle_a = \sum_{n=1}^{N} \langle \nu_n | H_0 | \nu_n \rangle + \frac{1}{2} \sum_{n,m=1}^{N} {}_a\langle \nu_n\nu_m | V | \nu_n\nu_m \rangle_a = \sum_{n=1}^{N} e_n - \frac{1}{2} \sum_{n,m=1}^{N} {}_a\langle \nu_n\nu_m | V | \nu_n\nu_m \rangle_a .$$

But we have Koopmans' theorem: the last particle has the energy

$$e_N = E_0(N) - E_0(N-1) ,$$

since

$$e_N = \langle \nu_N | H_0 | \nu_N \rangle + \sum_{n=1}^{N-1} {}_a\langle \nu_N\nu_n | V | \nu_N\nu_n \rangle_a .$$


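We now consider the Hartree-Fock equations in the real-space representation, where we restrict ourselves to local Wigner forces. Then the following abbreviations are useful:

$$V_H(\mathbf{r}) = \sum_{n=1}^{N} \int \mathrm{d}^3r'\; \psi_n^*(\mathbf{r}')\, V(\mathbf{r}, \mathbf{r}')\, \psi_n(\mathbf{r}') \quad \text{(Hartree term)} ,$$

$$V_F(\mathbf{r}, \mathbf{r}') = \sum_{n=1}^{N'} \psi_n^*(\mathbf{r}')\, V(\mathbf{r}, \mathbf{r}')\, \psi_n(\mathbf{r}) \quad \text{(Fock term)} .$$

Note that only the states $|\nu_n\rangle$ with the same spin orientation as the unknown solution (indicated by $N'$) contribute to the Fock term. For spin-independent operators $H_0$ and $V$, we have

$$\big( H_0 + V_H(\mathbf{r}) \big)\, \psi_\nu(\mathbf{r}) - \int \mathrm{d}^3r'\; V_F(\mathbf{r}, \mathbf{r}')\, \psi_\nu(\mathbf{r}') = \psi_\nu(\mathbf{r})\, e_\nu .$$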


Here the direct term (Hartree term) and the exchange term (Fock term) contain the wave functions to be determined. The Hartree-Fock equations can be solved only iteratively. We first use a suitable ansatz for $V_H$ and $V_F$, solve the eigenvalue equation, and then use the eigenfunctions found in this way to get a better approximation for $V_H$ and $V_F$, and so on. This method has to be repeated until the solutions of the Hartree-Fock equations do not change within given limits (until they are self-consistent).
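The iteration just described can be illustrated with a toy model that is not taken from the text: two fermions of opposite spin on two sites with site energies $0$ and $\Delta$, hopping $t$, and an on-site repulsion $U$ between opposite spins. In this hypothetical model each fermion has no partner of equal spin, so the Fock term is absent and the effective one-particle matrix is $H_0$ plus $U$ times the density of the other fermion. All names and parameter values below are assumptions made for the sketch, including the simple density mixing used to stabilize the iteration.

```python
from math import sqrt


def lowest_eigenpair(a, b, d):
    """Lowest eigenvalue and normalized eigenvector of [[a, b], [b, d]], b != 0."""
    e = 0.5 * (a + d) - sqrt(0.25 * (a - d) ** 2 + b * b)
    v1, v2 = b, e - a
    norm = sqrt(v1 * v1 + v2 * v2)
    return e, (v1 / norm, v2 / norm)


def scf(t=1.0, delta=1.0, U=2.0, mixing=0.5, tol=1e-12, max_iter=500):
    """Iterate density -> effective matrix -> orbital -> density until stable."""
    n1 = 0.5                                   # starting guess for the density on site 1
    for _ in range(max_iter):
        e, (c1, c2) = lowest_eigenpair(U * n1, -t, delta + U * (1.0 - n1))
        new_n1 = c1 * c1
        if abs(new_n1 - n1) < tol:             # self-consistent within the limit
            return e, c1, c2
        n1 = (1.0 - mixing) * n1 + mixing * new_n1
    raise RuntimeError("self-consistency not reached")


def energy_from_orbital_energies(e, c1, c2, U=2.0):
    """E_0 = sum of one-particle energies minus the doubly counted coupling."""
    return 2.0 * e - U * (c1 ** 4 + c2 ** 4)


def energy_direct(c1, c2, t=1.0, delta=1.0, U=2.0):
    """E_0 = <H_0> for both fermions plus the interaction energy, counted once."""
    one_body = -2.0 * t * c1 * c2 + delta * c2 * c2
    return 2.0 * one_body + U * (c1 ** 4 + c2 ** 4)


e, c1, c2 = scf()
print(e, c1 * c1, c2 * c2)
print(energy_from_orbital_energies(e, c1, c2), energy_direct(c1, c2))
```

The two energy expressions agree at self-consistency, which mirrors the statement that the sum of the one-particle energies counts the coupling twice.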

The exchange term is non-local and impedes the calculations. If we neglect it, we have the simple Hartree equations, but their solutions are not orthogonal to each other, because they belong to a wrong Hamilton operator [7]. The exchange hole is less effective for repulsive (Coulomb) forces than for attractive (nuclear) forces. The Hartree equations are therefore essentially more appropriate for atomic physics than for nuclear physics.


5.4.3 Rest Interaction and Pair Force 


The Hartree-Fock method thus delivers the best one-particle states: it diagonalizes the Hamilton operator as well as possible with a single antisymmetric product of such states. However, there are still off-diagonal elements originating from the two-body coupling. In fact, only

$$H_{\mathrm{HF}} = \sum_{\nu\nu'} \Big( \langle \nu | H_0 | \nu' \rangle + \sum_{n=1}^{N} {}_a\langle \nu\nu_n | V | \nu'\nu_n \rangle_a \Big)\, \Psi_\nu^\dagger \Psi_{\nu'}$$

becomes diagonalized, and there remains a residual interaction

$$V_R = H - H_{\mathrm{HF}} .$$

In order to include this term, we have to superpose several product states whose components differ from each other by the quantum numbers of at least two particles. This configuration mixture delivers a further correlation, in addition to the symmetry condition, which is related to the exchange hole.

We thus ask which parts of the coupling V are already well approximated by a 
one-particle operator and which remain as the residual interaction. Clearly, the parts 
with longer range change only weakly with the distance from the remaining partners. 
These can be well described by an average one-particle potential. Consequently, the 
residual interaction describes the parts of short range. 

In order to study those effects, we could investigate the limit of a delta force $\propto \delta(\mathbf{r} - \mathbf{r}')$. But since the matrix elements

$$\langle \nu\mu |\, \delta\, | \nu'\mu' \rangle = \int \mathrm{d}^3r\; \psi_\nu^*(\mathbf{r})\, \psi_\mu^*(\mathbf{r})\, \psi_{\nu'}(\mathbf{r})\, \psi_{\mu'}(\mathbf{r})$$

very often differ from zero, the corresponding problem is still too involved. We take the so-called pair force, which, to exaggerate somewhat, has even shorter range: it acts only between fermions in mutually time-reversed states (and which are therefore equally probable everywhere). For a Hamilton operator with time-reversal symmetry, they have the same energy according to Kramers' theorem (see p. 314).

Thus as residual interaction we take

$$V_{\mathrm{pair}} = \sum_{\nu\nu'} {}_a\langle\nu\bar\nu|\,V\,|\nu'\bar\nu'\rangle_a\; \Psi_\nu^\dagger \Psi_{\bar\nu}^\dagger \Psi_{\bar\nu'} \Psi_{\nu'} ,$$

without summation over $\bar\nu$ and $\bar\nu'$. The two states $|\nu\rangle$ and $|\bar\nu\rangle$ have opposite momentum
and angular momentum. Therefore, it is often assumed that $\nu$ and $\bar\nu$ differ only in
the sign ($\nu = -\bar\nu > 0$) and we then require $\nu, \nu' > 0$ for the sum. Since the matrix
elements of the delta function are always positive, we shall also assume that the
matrix elements of the pair force all have the same sign, which, for an attractive pair
force, will be negative.

For such a pair force, close to the Hartree-Fock ground state, it is energetically
favorable if the fermion levels are pairwise occupied or empty: if $|\nu\rangle$
is occupied, then so is $|\bar\nu\rangle$. If the ground state according to the Hartree-Fock method
(with even particle number $N$) is of the form $|\nu_1\bar\nu_1 \ldots \nu_{N/2}\bar\nu_{N/2}\rangle_a$, it now also contains
(superposed) states which differ by pairs $\nu\bar\nu$. These have neither momentum nor
angular momentum. In excited states, these pairs can also break up.


5.4.4 Quasi-Particles in the BCS Formalism 


Despite all the simplifications which result from the pair force (compared to the 
actually expected residual interaction), the eigenvalue problem is still too difficult. 
Bardeen, Cooper, and Schrieffer proposed an approximating ansatz for the ground 
state which allows the pair force to be diagonalized rather easily: 


$$|\mathrm{BCS}\rangle = \prod_{\nu>0} \big(u_\nu + v_\nu\, \Psi_\nu^\dagger \Psi_{\bar\nu}^\dagger\big)\,|0\rangle ,$$

where instead of $u_\nu$ and $v_\nu$ we could also take $\cos\varphi_\nu$ and $\sin\varphi_\nu$ (see p. 310):

$$u_\nu^2 + v_\nu^2 = 1 , \qquad u_\nu = u_\nu^* > 0 , \qquad v_\nu = v_\nu^* .$$

The occupation probabilities of the states $|\nu\rangle$ and $|\bar\nu\rangle$ are thus equal and easy to
remember. With probability $u_\nu^2$ they are unoccupied (empty), and with probability
$v_\nu^2$ they are occupied (filled). However, the ansatz has the disadvantage that the
particle number is not sharp. We can only require the expectation value to deliver the
correct particle number $n$, i.e.,

$$\langle\mathrm{BCS}|\,N\,|\mathrm{BCS}\rangle = \sum_{\nu>0} 2v_\nu^2 = n ,$$

while the particle number itself fluctuates, as will be shown later:

$$(\Delta N)^2 = \langle\mathrm{BCS}|\,N^2\,|\mathrm{BCS}\rangle - \langle\mathrm{BCS}|\,N\,|\mathrm{BCS}\rangle^2 = 4 \sum_{\nu>0} u_\nu^2 v_\nu^2 .$$

In fact, for most terms, we find either $u_\nu^2 \approx 0$ or $v_\nu^2 \approx 0$, and hence
$(\Delta N)^2 < 4\sum_{\nu>0} v_\nu^2 = 2n$, but this uncertainty is nevertheless irritating for the smaller particle
numbers, hence particularly in atomic and nuclear physics, but less in solid-state
physics. Clearly, in this approximation, we cannot describe any properties varying
quickly with the particle number, only the slowly varying ones.
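The two statements about the particle number can be checked by brute force. The following is a minimal sketch, assuming that in the BCS product state each pair $(\nu, \bar\nu)$ is independently empty (probability $u_\nu^2$) or doubly occupied (probability $v_\nu^2$); the occupation probabilities in `v2` are hypothetical test values, not taken from the text.

```python
from itertools import product

def bcs_number_statistics(v2):
    """Mean and variance of N for a BCS-type product state.

    v2[i] is the occupation probability v_nu^2 of the i-th pair (nu, nu-bar);
    each pair is either empty (probability 1 - v2[i]) or holds two particles.
    """
    mean = 0.0
    mean_sq = 0.0
    for occupation in product((0, 1), repeat=len(v2)):
        prob = 1.0
        for filled, p in zip(occupation, v2):
            prob *= p if filled else (1.0 - p)
        n = 2 * sum(occupation)          # two particles per occupied pair
        mean += prob * n
        mean_sq += prob * n * n
    return mean, mean_sq - mean * mean

v2 = [0.98, 0.9, 0.6, 0.3, 0.05]         # hypothetical occupation probabilities
mean, variance = bcs_number_statistics(v2)
mean_formula = sum(2 * p for p in v2)                   # sum of 2 v^2
variance_formula = sum(4 * (1 - p) * p for p in v2)     # 4 * sum of u^2 v^2
```

Only the pairs with $v_\nu^2$ appreciably different from 0 and 1 contribute to the variance, which is why the fluctuation stays well below $2n$.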




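The state $|\mathrm{BCS}\rangle$ may be taken as a quasi-vacuum. It has neither momentum nor
angular momentum, but it does have energy. Moreover, its particle number is not zero. Acting
on this quasi-vacuum are quasi-particle operators $\Phi_\nu^\dagger$, which again obey the Fermi
exchange rule. Carrying out the Bogoliubov transformation,

$$\Phi_\nu = u_\nu \Psi_\nu - v_\nu \Psi_{\bar\nu}^\dagger \quad\Longleftrightarrow\quad \Phi_\nu^\dagger = u_\nu \Psi_\nu^\dagger - v_\nu \Psi_{\bar\nu} ,$$

the commutation rule for the operators $\Psi$ implies

$$\Phi_\nu \Phi_{\nu'}^\dagger + \Phi_{\nu'}^\dagger \Phi_\nu = \langle\nu|\nu'\rangle \quad\text{and}\quad \Phi_\nu \Phi_{\nu'} + \Phi_{\nu'} \Phi_\nu = 0 .$$

Whether a particle is annihilated in the state $|\nu\rangle$ or created in the state $|\bar\nu\rangle$ makes no
difference to the momentum and angular momentum, only to the particle number
and the energy. For this reason, the Bogoliubov transformation is not as peculiar as
it may appear at first sight.

According to p. 314, for fermions, we have $|\bar{\bar\nu}\rangle = -|\nu\rangle$. This yields

$$\Phi_{\bar\nu} = u_\nu \Psi_{\bar\nu} + v_\nu \Psi_\nu^\dagger \quad\Longleftrightarrow\quad \Phi_{\bar\nu}^\dagger = u_\nu \Psi_{\bar\nu}^\dagger + v_\nu \Psi_\nu ,$$
$$\Psi_\nu = u_\nu \Phi_\nu + v_\nu \Phi_{\bar\nu}^\dagger \quad\text{and}\quad \Psi_{\bar\nu} = u_\nu \Phi_{\bar\nu} - v_\nu \Phi_\nu^\dagger .$$

Now we may deduce that

$$\Phi_\nu |\mathrm{BCS}\rangle = |o\rangle \quad\text{and}\quad \Phi_\nu^\dagger |\mathrm{BCS}\rangle = \Psi_\nu^\dagger \prod_{\nu'(\neq\nu)>0} \big(u_{\nu'} + v_{\nu'} \Psi_{\nu'}^\dagger \Psi_{\bar\nu'}^\dagger\big)|0\rangle ,$$

as well as $\Phi_\nu \Phi_{\nu'}^\dagger |\mathrm{BCS}\rangle = |\mathrm{BCS}\rangle \langle\nu|\nu'\rangle$. We can see that the particle number operator
$\sum_{\nu>0} (\Psi_\nu^\dagger \Psi_\nu + \Psi_{\bar\nu}^\dagger \Psi_{\bar\nu})$ is generally no longer diagonal by considering

$$N = \sum_{\nu>0} \big\{ 2v_\nu^2 + (u_\nu^2 - v_\nu^2)\,(\Phi_\nu^\dagger \Phi_\nu + \Phi_{\bar\nu}^\dagger \Phi_{\bar\nu}) + 2u_\nu v_\nu\,(\Phi_\nu^\dagger \Phi_{\bar\nu}^\dagger + \Phi_{\bar\nu} \Phi_\nu) \big\} .$$

With this result, we can also prove the above-mentioned expression for $(\Delta N)^2$.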
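For a single pair of modes $(\nu, \bar\nu)$ these operator relations can be verified with explicit $4\times4$ matrices. The sketch below uses a Jordan-Wigner representation of the two fermion modes, which is an assumption of this example and not a construction from the text; the angle `phi_angle` is an arbitrary test value.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product; rows are indexed by (i, k), columns by (j, l)."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def dagger(a):
    """Adjoint; all matrices here are real, so this is the transpose."""
    return [list(row) for row in zip(*a)]

def combine(x, a, y, b):
    """Linear combination x*a + y*b."""
    return [[x * p + y * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def anticommutator(a, b):
    return combine(1, matmul(a, b), 1, matmul(b, a))

def apply(a, vec):
    return [sum(a[i][k] * vec[k] for k in range(len(vec))) for i in range(len(a))]

# Jordan-Wigner matrices: first factor is the mode nu, second factor nu-bar.
LOWER = [[0, 1], [0, 0]]            # removes one particle: |1> -> |0>
UNIT = [[1, 0], [0, 1]]
SIGN = [[1, 0], [0, -1]]            # supplies the fermionic sign
psi = kron(LOWER, UNIT)             # Psi_nu
psi_bar = kron(SIGN, LOWER)         # Psi_nubar

phi_angle = 0.7                     # arbitrary test angle
u, v = math.cos(phi_angle), math.sin(phi_angle)

# Bogoliubov transformation as in the text.
phi = combine(u, psi, -v, dagger(psi_bar))     # Phi_nu    = u Psi_nu    - v Psi_nubar^dagger
phi_b = combine(u, psi_bar, v, dagger(psi))    # Phi_nubar = u Psi_nubar + v Psi_nu^dagger

vacuum = [1, 0, 0, 0]
pair = apply(matmul(dagger(psi), dagger(psi_bar)), vacuum)
bcs = [u * x + v * y for x, y in zip(vacuum, pair)]   # (u + v Psi_nu^dag Psi_nubar^dag)|0>

number = combine(1, matmul(dagger(psi), psi), 1, matmul(dagger(psi_bar), psi_bar))
mean_n = sum(x * y for x, y in zip(bcs, apply(number, bcs)))
```

In this representation both quasi-particle annihilators map `bcs` to the zero vector, and `mean_n` reproduces $2v_\nu^2$ for the single pair.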


5.4.5 Hartree—Fock—Bogoliubov Equations 


Using the Bogoliubov transformation, we can go over from particle to quasi-particle
operators and generalize the Hartree-Fock equations in such a way that, within the
framework of the BCS ansatz, pair correlations are also included. This yields the
quasi-particle energies $e_\nu$ and the occupation probabilities $v_\nu^2 = 1 - u_\nu^2$ of the ground
state.

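As in the Hartree-Fock method, we also diagonalize the one-particle parts here, but
now in the quasi-particle formalism. Since the particle number is no longer sharp, we
want to obtain at least its mean value correctly, and therefore introduce the chemical
potential $\mu$ as a Lagrangian parameter (see p. 560). The Hartree-Fock-Bogoliubov
equations read

$$\langle\nu|\,H_0 + V - \mu N\,|\nu'\rangle = \langle\nu|\nu'\rangle\, e_\nu .$$

Here $\nu$ and $\nu'$ are either both positive or both negative, time-reversed states being
orthogonal to each other in any case. The Hamilton operator should be Hermitian
and invariant under time reversal. Then according to p. 314, we have
$\langle\bar\nu'|\,H_0\,|\bar\nu\rangle = \langle\nu'|\,H_0\,|\nu\rangle^* = \langle\nu|\,H_0\,|\nu'\rangle$, so

$$H_0 = \sum_{\nu\nu'>0} \big\{ \langle\nu|\,H_0\,|\nu'\rangle\, (\Psi_\nu^\dagger \Psi_{\nu'} + \Psi_{\bar\nu'}^\dagger \Psi_{\bar\nu}) + \langle\bar\nu|\,H_0\,|\nu'\rangle\, \Psi_{\bar\nu}^\dagger \Psi_{\nu'} + \langle\nu|\,H_0\,|\bar\nu'\rangle\, \Psi_\nu^\dagger \Psi_{\bar\nu'} \big\} .$$

In order to determine $e_\nu$, we need only the part with the factors

$$\Psi_\nu^\dagger \Psi_{\nu'} + \Psi_{\bar\nu'}^\dagger \Psi_{\bar\nu} = 2v_\nu^2\, \langle\nu|\nu'\rangle + (u_\nu u_{\nu'} - v_\nu v_{\nu'})\,(\Phi_\nu^\dagger \Phi_{\nu'} + \Phi_{\bar\nu'}^\dagger \Phi_{\bar\nu}) + (u_\nu v_{\nu'} + v_\nu u_{\nu'})\,(\Phi_\nu^\dagger \Phi_{\bar\nu'}^\dagger + \Phi_{\bar\nu} \Phi_{\nu'}) .$$

The remaining terms of $H_0$ do not contribute to the matrix element above, because
they have opposite signs of $\nu$ and $\nu'$.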

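Only the terms with pairwise positive or negative $\nu\mu\nu'\mu'$ are important for
$V = \tfrac14 \sum_{\nu\mu\nu'\mu'} {}_a\langle\nu\mu|\,V\,|\nu'\mu'\rangle_a\, \Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'}$, viz.,

$$\begin{aligned}
& \tfrac14 \sum_{\nu\mu\nu'\mu'>0} {}_a\langle\nu\mu|\,V\,|\nu'\mu'\rangle_a\, \big(\Psi_\nu^\dagger \Psi_\mu^\dagger \Psi_{\mu'} \Psi_{\nu'} + \Psi_{\bar\nu'}^\dagger \Psi_{\bar\mu'}^\dagger \Psi_{\bar\mu} \Psi_{\bar\nu}\big) \\
&\quad + \tfrac12 \sum_{\nu\mu\nu'\mu'>0} {}_a\langle\nu\bar\mu|\,V\,|\nu'\bar\mu'\rangle_a\, \big(\Psi_\nu^\dagger \Psi_{\bar\mu}^\dagger \Psi_{\bar\mu'} \Psi_{\nu'} + \Psi_{\bar\nu'}^\dagger \Psi_{\mu'}^\dagger \Psi_{\mu} \Psi_{\bar\nu}\big) \\
&= \sum_{\nu\mu>0} \big({}_a\langle\nu\mu|\,V\,|\nu\mu\rangle_a + {}_a\langle\nu\bar\mu|\,V\,|\nu\bar\mu\rangle_a\big)\, v_\nu^2 v_\mu^2 \\
&\quad + \sum_{\nu\mu>0} {}_a\langle\bar\nu\nu|\,V\,|\bar\mu\mu\rangle_a\, u_\nu v_\nu\, u_\mu v_\mu \\
&\quad + \sum_{\nu\mu\nu'>0} \big({}_a\langle\nu\mu|\,V\,|\nu'\mu\rangle_a + {}_a\langle\nu\bar\mu|\,V\,|\nu'\bar\mu\rangle_a\big)\, v_\mu^2\, \big\{ (u_\nu u_{\nu'} - v_\nu v_{\nu'})\,(\Phi_\nu^\dagger \Phi_{\nu'} + \Phi_{\bar\nu'}^\dagger \Phi_{\bar\nu}) \\
&\qquad\qquad + (u_\nu v_{\nu'} + v_\nu u_{\nu'})\,(\Phi_\nu^\dagger \Phi_{\bar\nu'}^\dagger + \Phi_{\bar\nu} \Phi_{\nu'}) \big\} \\
&\quad + \sum_{\nu\mu\nu'>0} {}_a\langle\bar\nu\nu'|\,V\,|\bar\mu\mu\rangle_a\, u_\mu v_\mu\, \big\{ -(u_\nu v_{\nu'} + v_\nu u_{\nu'})\,(\Phi_\nu^\dagger \Phi_{\nu'} + \Phi_{\bar\nu'}^\dagger \Phi_{\bar\nu}) \\
&\qquad\qquad + (u_\nu u_{\nu'} - v_\nu v_{\nu'})\,(\Phi_\nu^\dagger \Phi_{\bar\nu'}^\dagger + \Phi_{\bar\nu} \Phi_{\nu'}) \big\} ,
\end{aligned}$$

where we have left out terms with four quasi-particle operators, because we do not
need them in the following. The quasi-particle representation of the particle-number
operator $N$ was already given at the end of the last section. With this we have all the
terms necessary for the Hartree-Fock-Bogoliubov equations. In particular, with the
abbreviations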


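$$\langle\nu|\,\Gamma\,|\nu'\rangle = \sum_{\mu>0} \big({}_a\langle\nu\mu|\,V\,|\nu'\mu\rangle_a + {}_a\langle\nu\bar\mu|\,V\,|\nu'\bar\mu\rangle_a\big)\, v_\mu^2 ,$$

$$\Delta_{\nu\nu'} = -\sum_{\mu>0} {}_a\langle\bar\nu\nu'|\,V\,|\bar\mu\mu\rangle_a\, u_\mu v_\mu ,$$

the expectation value of the energy in the ground state is

$$\langle\mathrm{BCS}|\,H\,|\mathrm{BCS}\rangle = \sum_{\nu>0} \big\{ 2v_\nu^2\, \langle\nu|\,H_0 + \tfrac12\Gamma\,|\nu\rangle - u_\nu v_\nu\, \Delta_{\nu\nu} \big\} .$$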


v>0 


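New compared with the Hartree-Fock expression are the terms $\Delta_{\nu\nu'}$; they are no
longer neglected in the Hartree-Fock-Bogoliubov method. However, in addition to
the one-particle energies, the occupation probabilities $v_\nu^2 = 1 - u_\nu^2$ must now also
be determined. They follow from the Hartree-Fock-Bogoliubov equations

$$e_\nu\, \langle\nu|\nu'\rangle = (u_\nu u_{\nu'} - v_\nu v_{\nu'})\, \langle\nu|\,H_0 + \Gamma - \mu\,|\nu'\rangle + (u_\nu v_{\nu'} + v_\nu u_{\nu'})\, \Delta_{\nu\nu'} ,$$
$$0 = (u_\nu v_{\nu'} + v_\nu u_{\nu'})\, \langle\nu|\,H_0 + \Gamma - \mu\,|\nu'\rangle - (u_\nu u_{\nu'} - v_\nu v_{\nu'})\, \Delta_{\nu\nu'} .$$

The states $|\nu\rangle$ are required to diagonalize the operator $H_0 + \Gamma$, whose eigenvalues
$\varepsilon_\nu + \mu$ are the Hartree-Fock one-particle energies:

$$\langle\nu|\,H_0 + \Gamma\,|\nu'\rangle = \langle\nu|\nu'\rangle\, (\varepsilon_\nu + \mu) .$$

In addition, we restrict ourselves to the pair force as the residual interaction and
assume an attractive pair force (see p. 456):

$$\Delta_{\nu\nu'} = \langle\nu|\nu'\rangle\, \Delta_\nu , \quad\text{with}\quad \Delta_\nu > 0 .$$

Then the Hartree-Fock-Bogoliubov equations read

$$e_\nu = +(u_\nu^2 - v_\nu^2)\, \varepsilon_\nu + 2u_\nu v_\nu\, \Delta_\nu ,$$
$$0 = -(u_\nu^2 - v_\nu^2)\, \Delta_\nu + 2u_\nu v_\nu\, \varepsilon_\nu .$$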


With $u_\nu^2 + v_\nu^2 = 1$, we set $u_\nu \to \cos\varphi_\nu$, $v_\nu \to \sin\varphi_\nu$ and make use of the properties
of the trigonometric functions: $u_\nu^2 - v_\nu^2 \to \cos(2\varphi_\nu)$, $2u_\nu v_\nu \to \sin(2\varphi_\nu)$.
The second Hartree-Fock-Bogoliubov equation then delivers $\cot(2\varphi_\nu) = \varepsilon_\nu/\Delta_\nu$:
$\varphi_\nu$ decreases from $\pi/2$ to 0 between $\varepsilon_\nu \ll -\Delta_\nu$ and $\varepsilon_\nu \gg \Delta_\nu$. According to the
first Hartree-Fock-Bogoliubov equation, the quasi-particle energies $e_\nu$ are never
negative, and with $\sin\alpha = (1 + \cot^2\alpha)^{-1/2}$ and $\cos\alpha = \cot\alpha \cdot \sin\alpha$, it follows that
(see Fig. 5.13)

$$e_\nu = +\sqrt{\varepsilon_\nu^2 + \Delta_\nu^2} , \qquad u_\nu^2 = \frac{e_\nu + \varepsilon_\nu}{2e_\nu} , \qquad v_\nu^2 = \frac{e_\nu - \varepsilon_\nu}{2e_\nu} .$$

Fig. 5.13 Effects of the pair force. Quasi-particle energies $e_\nu$ for equidistant
one-particle energies as a function of the gap parameter $\Delta$ (left) and
occupation probability of the BCS ground state as a function of $\varepsilon_\nu/\Delta$ (right)

For $\Delta_\nu = 0$, we do not find pair effects, but the usual Hartree-Fock result: either
$u_\nu = 0$, $v_\nu = 1$, and $e_\nu = -\varepsilon_\nu$, or $u_\nu = 1$, $v_\nu = 0$, and $e_\nu = +\varepsilon_\nu$. While the Hartree-Fock
one-particle energies $\varepsilon_\nu$, measured from the Fermi energy $\mu$, can be positive or
negative, the Hartree-Fock-Bogoliubov eigenvalues $e_\nu$ are always positive.
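The closed-form solution can be checked against the two Hartree-Fock-Bogoliubov equations for a single level. This is a minimal sketch, assuming the pair-force form of the equations with real $u_\nu, v_\nu \ge 0$; the function name `bcs_amplitudes` and the sample energies are illustrative only.

```python
import math

def bcs_amplitudes(eps, delta):
    """Return (e, u, v) for a one-particle energy eps (measured from mu) and a gap delta > 0."""
    e = math.hypot(eps, delta)               # e = sqrt(eps^2 + delta^2)
    u = math.sqrt((e + eps) / (2 * e))
    v = math.sqrt((e - eps) / (2 * e))
    return e, u, v

delta = 1.0
checks = []
for eps in (-3.0, -0.5, 0.0, 0.5, 3.0):
    e, u, v = bcs_amplitudes(eps, delta)
    first = (u * u - v * v) * eps + 2 * u * v * delta      # should reproduce e
    second = -(u * u - v * v) * delta + 2 * u * v * eps    # should vanish
    checks.append((eps, e, u * u + v * v, first, second))
```

Far below the Fermi edge ($\varepsilon_\nu \ll -\Delta_\nu$) the routine gives $v_\nu^2 \approx 1$, far above it $v_\nu^2 \approx 0$, and exactly at the edge $u_\nu^2 = v_\nu^2 = 1/2$.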

Generally, the pair potential satisfies $\Delta_\nu \neq 0$. Then the Fermi edge is not sharp,
and that alters the states close to it. Thus there, the quasi-particle energies
$e_\nu = (\varepsilon_\nu^2 + \Delta_\nu^2)^{1/2}$ are different from the Hartree-Fock energies $\varepsilon_\nu$. An energy gap $\Delta_\nu$
appears, and only above this gap are there quasi-particle levels. Note that the energy
gap corresponds to the rest energy $mc^2$ in the expression $E/c = \sqrt{p^2 + (mc)^2}$ for the
energy of free particles according to special relativity theory (see p. 245).

The gap parameter $\Delta_\nu$ with $\Delta_\nu = -\sum_{\mu>0} {}_a\langle\bar\nu\nu|\,V\,|\bar\mu\mu\rangle_a\, u_\mu v_\mu$ has to satisfy the
so-called gap condition (or gap equation)

$$\Delta_\nu = -\sum_{\mu>0} {}_a\langle\bar\nu\nu|\,V\,|\bar\mu\mu\rangle_a\, \frac{\Delta_\mu}{2e_\mu} .$$

It is mainly the terms with $\mu \approx \nu_F$ that contribute to the sum, because $u_\mu v_\mu$ is
only different from zero close to the Fermi edge. In addition, the matrix elements
${}_a\langle\bar\nu\nu|\,V\,|\bar\mu\mu\rangle_a$ are particularly large for $\nu \approx \mu$, hence so is the gap parameter $\Delta_\nu$ for
$\nu \approx \nu_F$. The pair interaction can thus only be felt close to the Fermi edge. As long as
we are only interested in states close to the Fermi edge, we may use an average matrix
element

$$G = -{}_a\langle\bar\nu\nu|\,V\,|\bar\mu\mu\rangle_a \quad\text{for } \nu \approx \nu_F , \text{ otherwise zero.}$$

Then the gap parameter $\Delta_\nu$ no longer depends on the state $|\nu\rangle$, and the gap condition
simplifies to

$$\frac{2}{G} = \sum_{\mu \approx \nu_F} \frac{1}{\sqrt{\varepsilon_\mu^2 + \Delta^2}} .$$




In addition to the trivial solution $\Delta = 0$, there is another if

$$\frac{2}{G} < \sum_{\mu \approx \nu_F} \frac{1}{|\varepsilon_\mu|} .$$

The pair correlations thus set in abruptly with increasing pair force $G$, and hence every
perturbation theory fails.
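The threshold behaviour can be made explicit numerically. The sketch below assumes the simplified gap condition in the form $2/G = \sum_\mu (\varepsilon_\mu^2 + \Delta^2)^{-1/2}$ with a state-independent $G$, and a set of equidistant levels none of which lies exactly at the Fermi energy; the level scheme and the values of $G$ are hypothetical.

```python
import math

def gap_sum(levels, delta):
    """Right-hand side of the simplified gap condition."""
    return sum(1.0 / math.sqrt(eps * eps + delta * delta) for eps in levels)

def solve_gap(levels, g, tol=1e-12):
    """Nontrivial solution Delta > 0 of 2/G = gap_sum, or 0.0 if only Delta = 0 exists.

    Assumes that no level has eps == 0, so that gap_sum(levels, 0) is finite.
    """
    target = 2.0 / g
    if gap_sum(levels, 0.0) <= target:
        return 0.0                      # pair force below threshold
    lo, hi = 0.0, 1.0
    while gap_sum(levels, hi) > target: # gap_sum decreases monotonically with delta
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap_sum(levels, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = [k + 0.5 for k in range(-5, 5)]          # -4.5, -3.5, ..., 4.5
g_critical = 2.0 / gap_sum(levels, 0.0)           # threshold value of G
gap_weak = solve_gap(levels, 0.8 * g_critical)
gap_strong = solve_gap(levels, 0.5)
```

Below `g_critical` the routine returns zero, above it a finite gap, so $\Delta$ is not an analytic function of $G$ at the threshold.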


5.4.6 Hole States 


So far we have described the transition of a fermion from $|\nu\rangle$ to $|\nu'\rangle$ using the operator
$\Psi_{\nu'}^\dagger \Psi_\nu$. Such “particle scattering” occurs for small excitation energies only close to
the Fermi edge, and in fact preferably with $e_\nu < e_F$ and $e_{\nu'} > e_F$. Here we assume
a unique Fermi edge: in atomic and nuclear physics (for non-deformed nuclei), we
take closed shells, otherwise at least an even fermion number, so that the ground state
is not degenerate. We denote this “normal state” by $|0\rangle$.

Removing a particle from the state $|\nu\rangle$ turns this normal state into a hole state
$|\nu^{-1}\rangle$. It behaves with respect to momentum and angular momentum like the state $|\bar\nu\rangle$.
Instead of particle scattering, we may thus also speak of particle-hole pair generation
$|0\rangle \to |\nu^{-1}\nu'\rangle$. Below the Fermi edge, we also use hole operators $\Phi_\nu$, and above the
Fermi edge, the particle operators $\Psi_\nu$ as before, with

$$\Phi_\nu^\dagger |0\rangle = |\bar\nu^{-1}\rangle , \qquad \Phi_\nu \Phi_{\nu'}^\dagger + \Phi_{\nu'}^\dagger \Phi_\nu = \langle\nu|\nu'\rangle , \qquad \Phi_\nu \Phi_{\nu'} = -\Phi_{\nu'} \Phi_\nu .$$

With $\Phi_\nu^\dagger = \Psi_{\bar\nu}$, whence $\Phi_\nu = \Psi_{\bar\nu}^\dagger$, $\Phi_{\bar\nu}^\dagger = -\Psi_\nu$, and $\Phi_{\bar\nu} = -\Psi_\nu^\dagger$, they barely differ
from the BCS quasi-particle operators. Here, for states below the Fermi edge, we
carry out a Bogoliubov transformation of all field operators with $u_\nu = 0$, $v_\nu = -1$
(see Fig. 5.14).


5.4.7 Summary: Fermions 


The treatment of many-body systems with fermion creation and annihilation operators
was explained using the example of the Fermi gas. The best one-particle basis
derives from the Hartree-Fock equations. For pair forces, it is better to use quasi-particles
and the Hartree-Fock-Bogoliubov equations to introduce pair correlations.


Fig. 5.14 Feynman graphs with hole states. Hole arrows point downwards (time
reversal of the particle arrow in Fig. 5.11). Upper row: The four diagrams for a
one-particle operator ($T$), viz., pair creation, pair annihilation, hole scattering,
and the vacuum expectation value $\langle 0|\,T\,|0\rangle$. Lower row: A selection of two-particle
operators, viz., particle-hole and hole-hole scattering, particle scattering with pair
creation, and the one-particle potential


5.5 Photons 


5.5.1 Preparation for the Quantization of Electromagnetic 
Fields 


The electromagnetic field is described classically by the Maxwell equations. According
to p. 215, for homogeneous non-conductors, they deliver wave equations for the
electric field strength $\mathbf E$ and the magnetic flux density $\mathbf B$, and likewise for the scalar
potential $\Phi$ and the vector potential $\mathbf A$. In the following, we restrict ourselves to
homogeneous and isotropic media, hence constant scalar $\varepsilon$ and $\mu$.

According to quantum theory, we have to alter our notion of waves to permit a 
particle interpretation—radiation may exhibit interference effects, but it may also 
be granular. This can be obtained only via uncertainties: the experimental quantities 
have to be replaced by Hermitian operators with suitable commutation behavior. 

For the wave function we prefer the four-potential instead of the field strengths
$\mathbf E$ and $\mathbf B$, because, from the relations $\partial\mathbf B/\partial t = -\nabla \times \mathbf E$ and $\nabla \cdot \mathbf B = 0$, we see that
their components are not independent of each other. These two equations are already
automatically satisfied with the ansatz $\mathbf E = -\partial\mathbf A/\partial t - \nabla\Phi$ and $\mathbf B = \nabla \times \mathbf A$. Admittedly,
the potentials cannot be measured and also depend on the gauge, but then the
wave functions for electrons are not measurable either and contain an arbitrary phase.

It is better to characterize free particles by their momentum (wave vector) than by
their position. Therefore, we now consider the Fourier transform of the fields and take
the Coulomb or radiation gauge $\mathbf k \cdot \mathbf A(t, \mathbf k) = 0$. Then the transverse parts of the field
strengths $\mathbf E = -\partial\mathbf A/\partial t - i\mathbf k\Phi$ and $\mathbf B = i\mathbf k \times \mathbf A$ are $-\partial\mathbf A/\partial t$ and $i\mathbf k \times \mathbf A$ and their
longitudinal parts are $-i\mathbf k\Phi$ and 0. For any other gauge the vector potential also has
a longitudinal part. Note, however, that the Coulomb gauge is not Lorentz invariant.
If we do adopt the Lorentz gauge, we encounter other difficulties in quantum theory,
because the Lorentz condition cannot be transferred to operators. Then we have to
introduce longitudinal and scalar photons, which are not easily normalized (see, e.g.,
[8]). Here $\mathbf E_{\mathrm{long}} = -i\mathbf k\rho/(\varepsilon k^2)$ holds, according to the third Maxwell equation.

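We now consider the energy $W = \tfrac12 \int d^3r\, (\mathbf E \cdot \mathbf D + \mathbf H \cdot \mathbf B)$ (see p. 211) and the
momentum $\mathbf P = \int d^3r\, \mathbf D \times \mathbf B$ (see p. 215) in a non-conductor, i.e., with $\sigma = 0$ and
$\mathbf j = 0$, as well as $\mathbf D = \varepsilon\mathbf E$ and $\mathbf H = \mathbf B/\mu$. According to Parseval's equation (p. 23),
for the energy

$$W(t) = \frac12 \int d^3k\, \Big(\varepsilon\, \mathbf E^* \cdot \mathbf E + \frac{1}{\mu}\, \mathbf B^* \cdot \mathbf B\Big) = \frac{\varepsilon}{2} \int d^3k\, \Big(\frac{\partial\mathbf A^*}{\partial t} \cdot \frac{\partial\mathbf A}{\partial t} + \frac{k^2}{\varepsilon\mu}\, \mathbf A^* \cdot \mathbf A\Big) ,$$

with transverse gauge, and for the momentum

$$\mathbf P(t) = \varepsilon \int d^3k\, (\mathbf E^* \times \mathbf B) = -i\varepsilon \int d^3k\; \mathbf k\, \frac{\partial\mathbf A^*}{\partial t} \cdot \mathbf A .$$

According to p. 216, we have

$$\mathbf A(t, \mathbf k) = \frac{\mathbf A(\mathbf k) \exp(-i\omega t) + \mathbf A^*(-\mathbf k) \exp(+i\omega t)}{2} ,$$

and thus $\partial\mathbf A(t, \mathbf k)/\partial t = -\tfrac12 i\omega\, \{\mathbf A(\mathbf k) \exp(-i\omega t) - \mathbf A^*(-\mathbf k) \exp(+i\omega t)\}$. For the
energy, we may replace the integrand $\mathbf A^*(-\mathbf k) \cdot \mathbf A(-\mathbf k)$ by $\mathbf A^*(\mathbf k) \cdot \mathbf A(\mathbf k)$ and for
the momentum, $\mathbf k\, \mathbf A^*(-\mathbf k) \cdot \mathbf A(-\mathbf k)$ by $-\mathbf k\, \mathbf A^*(\mathbf k) \cdot \mathbf A(\mathbf k)$ (a variable transformation),
to deduce the time-independent expressions

$$W = \frac{\varepsilon}{2} \int d^3k\; \omega^2\, \mathbf A^*(\mathbf k) \cdot \mathbf A(\mathbf k) ,$$

$$\mathbf P = \frac{\varepsilon}{2} \int d^3k\; \omega\mathbf k\, \mathbf A^*(\mathbf k) \cdot \mathbf A(\mathbf k) ,$$

since the oscillating factors cancel for the energy and the momentum, in the latter
case because of the symmetry under $\mathbf k \leftrightarrow -\mathbf k$. This distinguishes the results calculated with
potentials from those calculated with field strengths.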
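The cancellation of the oscillating terms can be illustrated for a single pair of wave numbers $\pm k$. The following is a minimal sketch for a scalar, one-dimensional model, assuming the decomposition $A(t,k) = \tfrac12\{A(k)e^{-i\omega t} + A^*(-k)e^{+i\omega t}\}$ and dropping the common prefactor $\varepsilon/2$; the amplitudes are arbitrary test values.

```python
import cmath

def pair_energy_and_momentum(a_k, a_mk, omega, k, t):
    """Energy and momentum integrands summed over the wave numbers +k and -k.

    Scalar one-dimensional model; a_k and a_mk stand for A(k) and A(-k).
    The common prefactor epsilon/2 is dropped in both quantities.
    """
    energy = 0.0
    momentum = 0j
    for wave_number, a, b in ((k, a_k, a_mk), (-k, a_mk, a_k)):
        pos = a * cmath.exp(-1j * omega * t)
        neg = b.conjugate() * cmath.exp(1j * omega * t)
        field = 0.5 * (pos + neg)                 # A(t, k)
        rate = -0.5j * omega * (pos - neg)        # dA(t, k)/dt
        energy += abs(rate) ** 2 + omega ** 2 * abs(field) ** 2
        momentum += -2j * wave_number * rate.conjugate() * field
    return energy, momentum

a_k, a_mk = 0.8 + 0.3j, -0.2 + 0.5j               # arbitrary test amplitudes
omega, k = 2.0, 1.5
samples = [pair_energy_and_momentum(a_k, a_mk, omega, k, t) for t in (0.0, 0.4, 1.3, 2.9)]
expected_energy = omega ** 2 * (abs(a_k) ** 2 + abs(a_mk) ** 2)
expected_momentum = omega * k * (abs(a_k) ** 2 - abs(a_mk) ** 2)
```

For the energy the oscillating terms already cancel for each wave number separately, whereas for the momentum they cancel only in the sum over $+k$ and $-k$.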

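Because of the spins, we also have to consider the angular momentum:

$$\mathbf J = \varepsilon \int d^3r\; \mathbf r \times (\mathbf E \times \mathbf B) .$$

Here we replace only $\mathbf B$ by $\nabla \times \mathbf A$, but not $\mathbf E$ by $-\partial\mathbf A/\partial t$ for the time being. If $\mathbf E_c$
and $\mathbf r_c$ are now treated as constant, then

$$\mathbf E \times (\nabla \times \mathbf A) = \nabla\, \mathbf E_c \cdot \mathbf A - \mathbf E \cdot \nabla \mathbf A ,$$

according to Sect. 1.1.8, and

$$\begin{aligned}
\mathbf r \times \{\mathbf E \times (\nabla \times \mathbf A)\} &= -\nabla \times (\mathbf E_c \cdot \mathbf A\; \mathbf r) - \mathbf E \cdot \nabla\; \mathbf r_c \times \mathbf A \\
&= -\nabla \times (\mathbf E_c \cdot \mathbf A\; \mathbf r) - \mathbf E \cdot \nabla\; \mathbf r \times \mathbf A + \mathbf E \times \mathbf A .
\end{aligned}$$

The volume integrals of $\nabla \times (\mathbf E_c \cdot \mathbf A\; \mathbf r)$ and $\mathbf E \cdot \nabla\; \mathbf r \times \mathbf A$ can be changed into surface
integrals. Then there is initially only one more volume integral, of $\mathbf r \times \mathbf A\; \nabla \cdot \mathbf E$, but
the electric field is source-free here. These surface integrals $\int d\mathbf f \times \mathbf r\; \mathbf E \cdot \mathbf A$ and
$\int d\mathbf f \cdot \mathbf E\; \mathbf r \times \mathbf A$ pick up the orbital angular momentum of the fields. They depend on where
the origin of the position vectors lies and do not have a component in the direction
of propagation. This is different with the volume integral of $\mathbf E \times \mathbf A = \mathbf A \times \partial\mathbf A/\partial t$.
Here, using Parseval's equation, we arrive at the intrinsic angular momentum (spin)

$$\mathbf S = \varepsilon \int d^3k\; \mathbf A^*(t, \mathbf k) \times \frac{\partial\mathbf A(t, \mathbf k)}{\partial t} .$$

Since only terms even in $\mathbf k$ contribute to the integral, the parts oscillating at $2\omega$ cancel
again, and we find

$$\mathbf S = -\frac{i\varepsilon}{2} \int d^3k\; \omega\, \mathbf A^*(\mathbf k) \times \mathbf A(\mathbf k) .$$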


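The result $\mathbf S(\mathbf k) = -\tfrac12 i\varepsilon\omega\, \mathbf A^*(\mathbf k) \times \mathbf A(\mathbf k)$ is useful for the helicity $\mathbf S(\mathbf k) \cdot \mathbf e_k$.
Because of the transversality, on p. 218, we already introduced two mutually
orthogonal unit vectors $\mathbf e_\parallel$ and $\mathbf e_\perp$ with $\mathbf e_\parallel \times \mathbf e_\perp = \mathbf e_k$, and shortly after that also
complex unit vectors $\mathbf e_\pm \propto (\mathbf e_\parallel \pm i\mathbf e_\perp)/\sqrt2$. There, however, we did not determine the
phase factor, which we now adjust to the spherical harmonics $Y^{(l)}_m(\Omega)$. Let $\mathbf e(\Omega) = \mathbf e_r$
be the unit vector in the direction of $\Omega = (\theta, \varphi)$. Then, with $\mathbf e_0 = i\mathbf e_z$, we require

$$\mathbf e_m \cdot \mathbf e(\Omega) = \sqrt{\frac{4\pi}{3}}\; i^1\, Y^{(1)}_m(\Omega) = \begin{cases} i\cos\theta & \text{for } m = 0 , \\ \mp\dfrac{i}{\sqrt2}\, \sin\theta\, \exp(\pm i\varphi) & \text{for } m = \pm1 . \end{cases}$$

According to p. 332, we always took the factor $i^l$ for the expansions of functions
$f(\mathbf r)$ in terms of spherical harmonics. If, for $\mathbf k$ in the z-direction, we choose $\mathbf e_\parallel = \mathbf e_x$
and $\mathbf e_\perp = \mathbf e_y$, then we have $\mathbf e_\pm \cdot \mathbf e_\parallel = \mp i/\sqrt2$ and $\mathbf e_\pm \cdot \mathbf e_\perp = 1/\sqrt2$. Therefore, for
the expansion in terms of circularly polarized light, we take

$$\mathbf e_\pm = \mp i^1\, \frac{\mathbf e_\parallel \pm i\mathbf e_\perp}{\sqrt2} = \mathbf e_\mp^* ,$$

with the properties

$$\mathbf e_\pm^* \cdot \mathbf e_\pm = 1 , \qquad \mathbf e_\pm^* \times \mathbf e_\pm = \pm i\mathbf e_k , \qquad \mathbf e_\mp^* \cdot \mathbf e_\pm = 0 , \qquad \mathbf e_\mp^* \times \mathbf e_\pm = 0 .$$

The amplitudes for the two helicities are then

$$A_\pm(\mathbf k) = \mathbf e_\pm^* \cdot \mathbf A(\mathbf k) \quad\Longleftrightarrow\quad \mathbf A(\mathbf k) = \mathbf e_+ A_+(\mathbf k) + \mathbf e_- A_-(\mathbf k) ,$$

and hence we deduce the two equations

$$\mathbf A^*(\mathbf k) \cdot \mathbf A(\mathbf k) = |A_+(\mathbf k)|^2 + |A_-(\mathbf k)|^2 ,$$
$$\mathbf A^*(\mathbf k) \times \mathbf A(\mathbf k) = \big(|A_+(\mathbf k)|^2 - |A_-(\mathbf k)|^2\big)\, i\mathbf e_k .$$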
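These properties are easy to confirm with explicit components. The sketch below assumes propagation along the z-axis with $\mathbf e_\parallel = \mathbf e_x$, $\mathbf e_\perp = \mathbf e_y$ and the phase convention $\mathbf e_\pm = \mp i(\mathbf e_x \pm i\mathbf e_y)/\sqrt2$; the test amplitude `a_vec` is arbitrary.

```python
import math

def dot(a, b):
    """Bilinear product of two 3-vectors (no complex conjugation)."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def conj(a):
    return tuple(complex(x).conjugate() for x in a)

S = 1 / math.sqrt(2)
E_PLUS = (-1j * S, S, 0)           # -i (e_x + i e_y)/sqrt(2)
E_MINUS = (1j * S, S, 0)           # +i (e_x - i e_y)/sqrt(2)

def helicity_amplitudes(a):
    """A_+ and A_- of a transverse vector a, from A_pm = e_pm^* . A."""
    return dot(conj(E_PLUS), a), dot(conj(E_MINUS), a)

a_vec = (0.4 + 0.1j, -0.3 + 0.7j, 0)      # arbitrary transverse test amplitude
a_plus, a_minus = helicity_amplitudes(a_vec)
rebuilt = tuple(p * a_plus + m * a_minus for p, m in zip(E_PLUS, E_MINUS))
norm = dot(conj(a_vec), a_vec)             # A^* . A
spin = cross(conj(a_vec), a_vec)           # A^* x A
```

Only the z-component of `spin` is non-zero, and it is purely imaginary, in line with the factor $i\mathbf e_k$ in the second equation.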


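We can also give the contribution of the respective helicities to the energy and the
momentum, as soon as we know the amplitudes $A_\pm$.

Actually, for $\mathbf e_\pm$, we should also include the argument $\mathbf k$, because we need to note
that $\mathbf e_\pm^*(-\mathbf k) \times \mathbf e_\pm(-\mathbf k) = \pm i\, (-\mathbf e_k)$, and $\mathbf e_\pm(-\mathbf k) = \mathbf e_\pm^*(\mathbf k) = \mathbf e_\mp(\mathbf k)$. With this we
deduce

$$\mathbf A(t, \mathbf k) = \sum_{\lambda=\pm} \mathbf e_\lambda(\mathbf k)\, \frac{A_\lambda(\mathbf k) \exp(-i\omega t) + A_\lambda^*(-\mathbf k) \exp(+i\omega t)}{2} ,$$

or $A_\lambda(t, \mathbf k) = \tfrac12 \{A_\lambda(\mathbf k) \exp(-i\omega t) + A_\lambda^*(-\mathbf k) \exp(+i\omega t)\}$. Here we also have

$$A_\lambda(-\mathbf k) = \mathbf e_\lambda(\mathbf k) \cdot \mathbf A(-\mathbf k) .$$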


5.5.2 Quantization of Photons 


Clearly, the two quantities $|A_\pm(\mathbf k)|^2$ depend on the intensity of the radiation field.
Classically, in the wave picture, they may take arbitrary values $\ge 0$, but in quantum
physics they are restricted to a discrete set of values: there are only whole light
quanta, no fractions of them. We usually speak of photons rather than light quanta.

The properties of these photons can be read off from the previous expressions for
energy, momentum, and helicity densities in k-space:

$$W(\mathbf k) = \tfrac12 \varepsilon\, \omega^2\, \{|A_+(\mathbf k)|^2 + |A_-(\mathbf k)|^2\} ,$$
$$\mathbf P(\mathbf k) = \tfrac12 \varepsilon\, \omega\mathbf k\, \{|A_+(\mathbf k)|^2 + |A_-(\mathbf k)|^2\} ,$$
$$\mathbf S(\mathbf k) \cdot \mathbf e_k = \tfrac12 \varepsilon\, \omega\, \{|A_+(\mathbf k)|^2 - |A_-(\mathbf k)|^2\} .$$

The ratio of their energy to their momentum is thus $\omega/k = c$. According to relativity
theory (see p. 245), this ratio holds for all massless particles, so we can state that photons
do not have mass and therefore move with the velocity of light.


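If we now assume the known Planck-de Broglie relations for single photons, viz.,

$$E = \hbar\omega \quad\text{and}\quad \mathbf p = \hbar\mathbf k ,$$

then the density of the quanta with helicity $\lambda = \pm1$ is obtained as

$$\rho_\lambda(\mathbf k) = \frac{\varepsilon\omega}{2\hbar}\, |A_\lambda(\mathbf k)|^2 .$$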


The angular momentum along the direction of motion is thus $\pm\hbar$. We distinguish
between two helicities or two sorts of photons. In fact, they all have spin one, but it
is oriented only in or opposite to the direction of motion, not orthogonal to it; this
is a relativistic effect, which relates to the Lorentz contraction. With integer spin,
they are therefore bosons. (Electrons also have only two spin states, but they are
fermions.)

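The integral $\int d^3k\, \rho_\lambda(\mathbf k)$ in the classical calculation does not need to be an integer.
But in the particle picture, we have to enforce this by a special quantum condition,
viz., for photons we have to take creation and annihilation operators satisfying
the Bose commutation law:

$$[\Psi_\lambda(\mathbf k), \Psi_{\lambda'}^\dagger(\mathbf k')] = \langle\mathbf k, \lambda|\mathbf k', \lambda'\rangle \quad\text{and}\quad [\Psi_\lambda(\mathbf k), \Psi_{\lambda'}(\mathbf k')] = 0 .$$

According to p. 450, in the Heisenberg picture, the time dependence is given by

$$\Psi_\lambda(t, \mathbf k) = \Psi_\lambda(\mathbf k) \exp(-i\omega t) .$$

Since we are dealing with bosons, several photons can be in the same state $|\mathbf k, \lambda\rangle$.
From the expression for the particle density, which we understand as the expectation
value of $\Psi^\dagger \Psi$, we deduce the assignment

$$\Psi_\lambda(\mathbf k) = \sqrt{\frac{\varepsilon\omega}{2\hbar}}\, A_\lambda(\mathbf k) \quad\text{and}\quad \Psi_\lambda^\dagger(\mathbf k) = \sqrt{\frac{\varepsilon\omega}{2\hbar}}\, A_\lambda^*(\mathbf k) .$$

The Hamilton operator, momentum operator, and helicity operator then follow with $\omega = ck$:

$$H = \int d^3k\; \hbar\omega\, \{\Psi_+^\dagger(\mathbf k) \Psi_+(\mathbf k) + \Psi_-^\dagger(\mathbf k) \Psi_-(\mathbf k)\} ,$$

$$\mathbf P = \int d^3k\; \hbar\mathbf k\, \{\Psi_+^\dagger(\mathbf k) \Psi_+(\mathbf k) + \Psi_-^\dagger(\mathbf k) \Psi_-(\mathbf k)\} ,$$

$$\mathcal H = \int d^3k\; \{\Psi_+^\dagger(\mathbf k) \Psi_+(\mathbf k) - \Psi_-^\dagger(\mathbf k) \Psi_-(\mathbf k)\} .$$

The vector potential has now become an operator:

$$\mathbf A(t, \mathbf k) = \sqrt{\frac{2\hbar}{\varepsilon\omega}} \sum_{\lambda=\pm} \mathbf e_\lambda(\mathbf k)\, \frac{\Psi_\lambda(t, \mathbf k) + \Psi_\lambda^\dagger(t, -\mathbf k)}{2} .$$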

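Hence it follows that $\mathbf A^\dagger(t, \mathbf k) = \mathbf A(t, -\mathbf k)$, with $\mathbf e_\lambda^*(\mathbf k) = \mathbf e_\lambda(-\mathbf k)$. The transverse
electric and magnetic field operators are then obtained from $\mathbf E = -\partial\mathbf A/\partial t$ and
$\mathbf B = i\mathbf k \times \mathbf A$ (as well as from Weber's equation):

$$\mathbf E(t, \mathbf k) = i\sqrt{\frac{2\hbar\omega}{\varepsilon}} \sum_{\lambda=\pm} \mathbf e_\lambda(\mathbf k)\, \frac{\Psi_\lambda(t, \mathbf k) - \Psi_\lambda^\dagger(t, -\mathbf k)}{2} ,$$

$$\mathbf B(t, \mathbf k) = i\sqrt{2\hbar\omega\mu} \sum_{\lambda=\pm} \mathbf e_k \times \mathbf e_\lambda(\mathbf k)\, \frac{\Psi_\lambda(t, \mathbf k) + \Psi_\lambda^\dagger(t, -\mathbf k)}{2} ,$$

where we can also use $i\mathbf e_k \times \mathbf e_\lambda = \lambda\, (\mathbf e_\lambda^* \times \mathbf e_\lambda) \times \mathbf e_\lambda = \lambda\mathbf e_\lambda$, although this does not
always help.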

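In order to make the transition from $\mathbf k$ to $\mathbf r$, we consider arbitrary Cartesian components
$n$, unrelated to $\mathbf k$, instead of the helicities, and investigate $[\Psi_n(\mathbf k), \Psi_{n'}^\dagger(\mathbf k')]$.
This is equal to $\sum_{\lambda\lambda'} \mathbf e_n \cdot \mathbf e_\lambda\; \mathbf e_{\lambda'}^* \cdot \mathbf e_{n'}\, \langle\mathbf k, \lambda|\mathbf k', \lambda'\rangle$. Because of the last factor, we may
restrict ourselves to $\lambda = \lambda'$. Here $\sum_\lambda \mathbf e_\lambda\; \mathbf e_\lambda^* \cdot \mathbf e_{n'}$ is the part of $\mathbf e_{n'}$ perpendicular to
$\mathbf k/k = \mathbf e_k$, which, according to p. 4, we may thus write as $\mathbf e_{n'} - \mathbf e_k\; \mathbf e_k \cdot \mathbf e_{n'}$. Therefore
we deduce

$$[\Psi_n(\mathbf k), \Psi_{n'}^\dagger(\mathbf k')] = (\delta_{nn'} - \mathbf e_n \cdot \mathbf e_k\; \mathbf e_k \cdot \mathbf e_{n'})\, \langle\mathbf k|\mathbf k'\rangle ,$$

as a generalization of $[\Psi_\lambda(\mathbf k), \Psi_{\lambda'}^\dagger(\mathbf k')] = \langle\mathbf k, \lambda|\mathbf k', \lambda'\rangle$.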
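The polarization sum behind this commutator can be checked for an arbitrary propagation direction. This is a minimal sketch, assuming the phase convention $\mathbf e_\pm = \mp i(\mathbf e_\parallel \pm i\mathbf e_\perp)/\sqrt2$; the way $\mathbf e_\parallel$ is constructed from a helper axis is a choice made for this example only.

```python
import math

def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def polarization_vectors(k):
    """Return e_k and the two circular polarization vectors e_+ and e_- for wave vector k."""
    e_k = normalize(k)
    helper = (1.0, 0.0, 0.0) if abs(e_k[0]) < 0.9 else (0.0, 1.0, 0.0)
    overlap = sum(h * e for h, e in zip(helper, e_k))
    e_par = normalize(tuple(h - overlap * e for h, e in zip(helper, e_k)))
    e_perp = cross(e_k, e_par)                 # so that e_par x e_perp = e_k
    s = 1 / math.sqrt(2)
    e_plus = tuple(-1j * s * (p + 1j * q) for p, q in zip(e_par, e_perp))
    e_minus = tuple(1j * s * (p - 1j * q) for p, q in zip(e_par, e_perp))
    return e_k, e_plus, e_minus

def polarization_sum(k):
    """Matrix with elements sum over lambda of (e_n . e_lambda)(e_lambda^* . e_n')."""
    _, e_plus, e_minus = polarization_vectors(k)
    return [[sum(e[n] * e[m].conjugate() for e in (e_plus, e_minus))
             for m in range(3)] for n in range(3)]

def transverse_projector(k):
    """Matrix with elements delta_nn' - k_n k_n' / k^2."""
    e_k = normalize(k)
    return [[(1.0 if n == m else 0.0) - e_k[n] * e_k[m] for m in range(3)]
            for n in range(3)]

k_test = (1.0, 2.0, -2.0)                      # arbitrary wave vector
lhs = polarization_sum(k_test)
rhs = transverse_projector(k_test)
```

The matrix `rhs` annihilates `k_test`, which is the k-space form of the statement that the transverse part of a field has no component along $\mathbf k$.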
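For the fields $\mathbf A$ and $\mathbf B$, there is the sum of $\Psi$ and $\Psi^\dagger$, and for $\mathbf E$, their difference.
Therefore, the commutation laws are different. In fact,

$$0 = [A_n(\mathbf k), A_{n'}(\mathbf k')] = [E_n(\mathbf k), E_{n'}(\mathbf k')] = [B_n(\mathbf k), B_{n'}(\mathbf k')]$$

and $[A_n(\mathbf k), B_{n'}(\mathbf k')] = 0$, but also, using $\langle\mathbf k\lambda|{-\mathbf k'}\lambda'\rangle = \langle\mathbf k|{-\mathbf k'}\rangle \langle\lambda|\lambda'\rangle$ and
$\mathbf e_{-\lambda} = \mathbf e_\lambda^*$ as well as Weber's equation,

$$[A_n(\mathbf k), E_{n'}(\mathbf k')] = \frac{\hbar}{i\varepsilon}\, (\delta_{nn'} - \mathbf e_n \cdot \mathbf e_k\; \mathbf e_k \cdot \mathbf e_{n'})\, \langle\mathbf k|{-\mathbf k'}\rangle ,$$

$$[E_n(\mathbf k), B_{n'}(\mathbf k')] = \frac{\hbar}{\varepsilon}\, (\mathbf e_n \times \mathbf e_{n'}) \cdot \mathbf k\; \langle\mathbf k|{-\mathbf k'}\rangle .$$

Here we have at last made use of $\sum_\lambda \mathbf e_\lambda\; \mathbf e_\lambda^* \cdot (\mathbf e_{n'} \times \mathbf k) = \mathbf e_{n'} \times \mathbf k$.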


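After a Fourier transform $\mathbf k \to \mathbf r$, the corresponding operator functions of $\mathbf r$, rather
than $\mathbf k$, are

$$\mathbf A(t, \mathbf r) = \frac{1}{\sqrt{2\pi}^{\,3}} \int d^3k\; \exp(i\mathbf k \cdot \mathbf r)\, \mathbf A(t, \mathbf k) = \mathbf A^\dagger(t, \mathbf r) ,$$

where the last equation corresponds to the classical relation $\mathbf A(t, \mathbf r) = \mathbf A^*(t, \mathbf r) \Rightarrow
\mathbf A(t, \mathbf k) = \mathbf A^*(t, -\mathbf k)$. With $\Psi(t) = \Psi(0) \exp(-i\omega t)$ and $\Psi^\dagger(t) = \Psi^\dagger(0) \exp(i\omega t)$,
it is often useful to decompose the fields into the so-called positive-frequency part
$\mathbf A^+(t, \mathbf r)$ and negative-frequency part $\mathbf A^-(t, \mathbf r) = \mathbf A^{+\dagger}(t, \mathbf r)$:

$$\mathbf A(t, \mathbf r) = \mathbf A^+(t, \mathbf r) + \mathbf A^-(t, \mathbf r) ,$$

where

$$\mathbf A^+(t, \mathbf r) = \frac{1}{\sqrt{2\pi}^{\,3}} \int d^3k\; \sqrt{\frac{\hbar}{2\varepsilon\omega}} \sum_{\lambda=\pm} \mathbf e_\lambda(\mathbf k)\, \Psi_\lambda(\mathbf k)\, \exp\{i(\mathbf k \cdot \mathbf r - \omega t)\} ,$$

and likewise for the electric and magnetic fields.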
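In real space, with the transverse delta function

$$\delta^{\mathrm{trans}}_{nn'}(\mathbf r) = \int \frac{d^3k}{(2\pi)^3}\; (\delta_{nn'} - \mathbf e_n \cdot \mathbf e_k\; \mathbf e_k \cdot \mathbf e_{n'})\, \exp(i\mathbf k \cdot \mathbf r) ,$$

we have the not so simple commutation laws

$$[A_n(\mathbf r), E_{n'}(\mathbf r')] = \frac{\hbar}{i\varepsilon}\, \delta^{\mathrm{trans}}_{nn'}(\mathbf r - \mathbf r') ,$$

$$[E_n(\mathbf r), B_{n'}(\mathbf r')] = \frac{\hbar}{i\varepsilon}\, (\mathbf e_n \times \mathbf e_{n'}) \cdot \nabla \delta(\mathbf r - \mathbf r') .$$

Integration of $[E_n(\mathbf r), B_{n'}(\mathbf r')]$ over a space element around $\mathbf r'$ yields zero. Electric
and magnetic field strengths at the same position commute, and equal components
($n = n'$) of $\mathbf E$ and $\mathbf B$ also commute everywhere. Note that $\mathbf A(\mathbf r)$ and $-\varepsilon\mathbf E^{\mathrm{trans}}(\mathbf r)$ may
be taken as canonical conjugates, provided that we have ensured that the fields are
transverse.

The transverse delta function clearly has the following symmetries:

$$\delta^{\mathrm{trans}}_{nn'}(\mathbf r) = \delta^{\mathrm{trans}}_{n'n}(\mathbf r) = \delta^{\mathrm{trans}}_{nn'}(-\mathbf r) .$$

In addition, it is source-free, i.e., $\sum_n \partial\delta^{\mathrm{trans}}_{nn'}/\partial x_n = 0$, because

$$\sum_n k_n\, (\delta_{nn'} - k_n k_{n'}/k^2) = 0 , \quad\text{with}\quad k^2 = \sum_n k_n^2 .$$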


To relate this to the usual delta function, we consider

$$\int \frac{d^3k}{(2\pi)^3}\, \frac{k_n k_{n'}}{k^2}\, \exp(i\mathbf k \cdot \mathbf r) = -\frac{\partial^2}{\partial x_n \partial x_{n'}} \int \frac{d^3k}{(2\pi)^3}\, \frac{\exp(i\mathbf k \cdot \mathbf r)}{k^2} .$$

According to p. 410, the right-hand integral is equal to $(4\pi r)^{-1}$. We thus have

$$\delta^{\mathrm{trans}}_{nn'}(\mathbf r) = \delta_{nn'}\, \delta(\mathbf r) + \frac{\partial^2}{\partial x_n \partial x_{n'}}\, \frac{1}{4\pi r} ,$$

where, according to p. 172,

$$\frac{\partial^2}{\partial x_n \partial x_{n'}}\, \frac{1}{4\pi r} = \frac{3x_n x_{n'}/r^2 - \delta_{nn'}}{4\pi r^3} - \frac{\delta_{nn'}}{3}\, \delta(\mathbf r) .$$

Thus it has to be accounted for even when $n \neq n'$, otherwise we would also have to
split off the factor $\delta_{nn'}$.

All commutation laws have been derived here for equal times, and in the
Schrödinger picture, the field operators do not depend upon time. To avoid integrals
and improper Hilbert vectors, we must consider a finite volume $V$ and periodic
boundary conditions, as in Sect. 5.4.1.


5.5.3 Glauber States 


According to Sect. 4.2.8, the commutation law $[\Psi, \Psi^\dagger] = 1$ leads to the eigenvalues
$n \in \{0, 1, 2, \ldots\}$ of the operator $\Psi^\dagger\Psi$, and for a suitable phase convention to

$$\Psi|n\rangle = |n-1\rangle \sqrt{n} \quad\Longleftrightarrow\quad \Psi^\dagger|n\rangle = |n+1\rangle \sqrt{n+1} .$$

If $n$ is the particle number, $|0\rangle$ corresponds to the vacuum state, $\Psi$ is an annihilation
operator, and $\Psi^\dagger$ is a creation operator.

In Sect. 4.5.4, we used these operators for linear oscillations and set
$X = x_0\, (\Psi + \Psi^\dagger)/2$, $P = p_0\, (\Psi - \Psi^\dagger)/(2i)$. Since we are dealing here with canonically
conjugate quantities, for which the scale factors are not essential, we now consider
the components

$$A_1 = \frac{\Psi + \Psi^\dagger}{2} = A_1^\dagger \quad\text{and}\quad A_2 = \frac{\Psi - \Psi^\dagger}{2i} = A_2^\dagger .$$

If in particular the mean value (expectation value) $\bar A_1$ oscillates harmonically, then
so does $\bar A_2$, but with the phase shifted by $\pi/2$. The commutation law $[\Psi, \Psi^\dagger] = 1$
delivers

$$[A_1, A_2] = \tfrac12 i ,$$

and thus, according to p. 300, the uncertainty relation $\Delta A_1 \cdot \Delta A_2 \ge 1/4$.

In this and the next section, we shall consider in detail those states whose uncertainty
product $\Delta A_1 \cdot \Delta A_2$ is as small as possible, thus “as classical as possible”.
Then, according to p. 300, we must have

$$(A_1 - \bar A_1)\, |\psi\rangle = -i\, \frac{\Delta A_1}{\Delta A_2}\, (A_2 - \bar A_2)\, |\psi\rangle .$$


In this section, we restrict ourselves to $\Delta A_1 = \Delta A_2 = 1/2$ and hence to Glauber
states (which were in fact introduced by Schrödinger much earlier [9]), also called
coherent states, although this is somewhat misleading, because all pure states can be
superposed coherently. They are particularly important for the electromagnetic field
(the “photon states”) of lasers. In the next section we shall consider the more general
case $\Delta A_1 \neq \Delta A_2$, and in particular, quenched states.

With the field operator $\Psi = A_1 + iA_2$ from above, the constraint $(A_1 - \bar A_1)\, |\psi\rangle
= -i\, (A_2 - \bar A_2)\, |\psi\rangle$ reads $\Psi|\psi\rangle = |\psi\rangle \bar\Psi$: Glauber states are eigenstates of the annihilation operator
$\Psi$. This operator is not Hermitian. Therefore, we need a complex number in order
to label the eigenvalue. The symbol $\alpha$ is normally used, and we shall follow that here:

$$\Psi|\alpha\rangle = |\alpha\rangle\, \alpha , \quad\text{with}\quad \langle\alpha|\alpha\rangle = 1 .$$

Then $\langle\alpha|\Psi^\dagger = \alpha^* \langle\alpha|$, and consequently,

$$\langle\alpha|\, A_1\, |\alpha\rangle = \mathrm{Re}\,\alpha \quad\text{and}\quad \langle\alpha|\, A_2\, |\alpha\rangle = \mathrm{Im}\,\alpha ,$$

or $\alpha = \bar A_1 + i\bar A_2$. Note that, when $X = x_0 A_1$ and $P = p_0 A_2$, we also have
$\alpha = \bar X/x_0 + i\bar P/p_0$, so we take the two real phase-space components of the one-dimensional
oscillation as a complex number.

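We can create the Glauber state $|\alpha\rangle$ with a unitary operator $D(\alpha)$ (the exponent is
anti-Hermitian) from the ground state $|0\rangle$:

$$D(\alpha) = \exp(\alpha\Psi^\dagger - \alpha^*\Psi) , \quad\text{with}\quad D^\dagger(\alpha) = D(-\alpha) = D^{-1}(\alpha) .$$

Because of the property $D^\dagger(\alpha)\, \Psi\, D(\alpha) = \Psi + \alpha\, 1$ (the Hausdorff series, see p. 290, only
contains two terms here), $D(\alpha)$ is called the displacement operator. It leads to

$$\Psi D(\alpha)|0\rangle = D(\alpha)\, (\Psi + \alpha)|0\rangle = D(\alpha)|0\rangle\, \alpha ,$$

so

$$|\alpha\rangle = D(\alpha)|0\rangle .$$

Here, according to p. 290, we may factorize, hence,

$$D(\alpha) = \exp(\alpha\Psi^\dagger)\, \exp(-\alpha^*\Psi)\, \exp(-\tfrac12 |\alpha|^2) ,$$

and use $\exp(-\alpha^*\Psi)|0\rangle = |0\rangle$ along with $\Psi^{\dagger n}|0\rangle = |n\rangle \sqrt{n!}$:

$$|\alpha\rangle = \exp(-\tfrac12 |\alpha|^2) \sum_{n=0}^{\infty} \frac{\alpha^n}{\sqrt{n!}}\, |n\rangle .$$

Incidentally, $D(\alpha + \beta)$ does not simply factorize into $D(\alpha)\, D(\beta)$, because a phase
factor also occurs: $D(\alpha + \beta) = \exp\{i\, \mathrm{Im}(\alpha^*\beta)\}\, D(\alpha)\, D(\beta)$. This yields

$$D(\alpha)\, D(\beta) = \exp(\alpha\beta^* - \alpha^*\beta)\, D(\beta)\, D(\alpha) .$$

Consequently, we also have

$$\langle\alpha|\alpha'\rangle = \exp\{-\tfrac12 |\alpha - \alpha'|^2 + i\, \mathrm{Im}(\alpha^*\alpha')\} .$$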
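The overlap formula can be compared with a direct summation of the Fock-space expansion. This is a minimal sketch, assuming that a truncation at `n_max` terms is adequate for the moderate values of $|\alpha|$ used here; the function names are illustrative.

```python
import cmath
import math

def fock_amplitudes(alpha, n_max=80):
    """Amplitudes <n|alpha> for n = 0 .. n_max-1, built recursively to avoid large factorials."""
    amps = [cmath.exp(-0.5 * abs(alpha) ** 2)]
    for n in range(1, n_max):
        amps.append(amps[-1] * alpha / math.sqrt(n))
    return amps

def overlap_by_summation(alpha, beta, n_max=80):
    """<alpha|beta> as the sum over n of <alpha|n><n|beta>."""
    return sum(a.conjugate() * b
               for a, b in zip(fock_amplitudes(alpha, n_max), fock_amplitudes(beta, n_max)))

def overlap_closed_form(alpha, beta):
    """exp{-|alpha - beta|^2 / 2 + i Im(alpha^* beta)}."""
    return cmath.exp(-0.5 * abs(alpha - beta) ** 2 + 1j * (alpha.conjugate() * beta).imag)

alpha, beta = 1.0 + 0.5j, -0.3 + 1.2j          # arbitrary test values
summed = overlap_by_summation(alpha, beta)
closed = overlap_closed_form(alpha, beta)
```

The modulus of the overlap depends only on the distance $|\alpha - \alpha'|$ in the complex plane and never vanishes, which is the non-orthogonality discussed next.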


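The eigenstates of the non-Hermitian operator $\Psi$ are thus neither countable nor
orthogonal to each other. They nevertheless form a complete basis. We only have to
integrate over the whole complex plane. Instead of $d\mathrm{Re}\,\alpha\; d\mathrm{Im}\,\alpha$, however, we write
for short $d^2\alpha$ and take $\alpha$ and $\alpha^*$ to be independent of each other. Then

$$\int \frac{d^2\alpha}{\pi}\, |\alpha\rangle\langle\alpha| = 1 .$$

If we expand the left-hand side in terms of the complete basis $\{|n\rangle\}$, then we obtain
$\langle n|\alpha\rangle \langle\alpha|n'\rangle = \exp(-|\alpha|^2)\, \alpha^n \alpha^{*n'}/\sqrt{n!\, n'!}$, with $\alpha = a \exp(i\varphi)$, or $d^2\alpha = a\, da\, d\varphi$
and

$$\int \frac{d^2\alpha}{\pi}\, \langle n|\alpha\rangle \langle\alpha|n'\rangle = \int_0^\infty da\, \frac{a^{n+n'+1}\, e^{-a^2}}{\sqrt{n!\, n'!}}\; \frac{1}{\pi} \int_0^{2\pi} e^{i(n-n')\varphi}\, d\varphi .$$

The last integral is equal to $2\pi\, \delta_{nn'}$, and, for $n = n'$, the one to the left of it is equal to
1/2 (if we set $x = a^2$, so that $dx = 2a\, da$, then $\int_0^\infty x^n \exp(-x)\, dx = n!$ leads to the
result). The double integral is equal to $\langle n|n'\rangle$.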

So far we have always taken orthogonal bases and, for continuous variables, have
arrived at simple integrals. But now the states are no longer orthogonal to each other
and we require double integrals. The basis $\{|\alpha\rangle\}$ is said to be over-complete. An
arbitrary state can be decomposed in terms of these, but no longer uniquely, because
the basis states now depend linearly on each other. Hence, e.g., for all $n \in \{1, 2, \ldots\}$,

$$|o\rangle = \Psi^n |0\rangle = \int \frac{d^2\alpha}{\pi}\, \Psi^n |\alpha\rangle \langle\alpha|0\rangle = \int \frac{d^2\alpha}{\pi}\, |\alpha\rangle\, \alpha^n \exp(-\tfrac12 |\alpha|^2) .$$

Consequently, there are even infinitely many linear combinations of states $|\alpha\rangle$ which
may result in the zero vector $|o\rangle$.

In the Glauber state $|\alpha\rangle$, the operator $N = \Psi^\dagger\Psi$ has the expectation value

$$\langle\alpha|\, N\, |\alpha\rangle = |\alpha|^2 ,$$

and, with $N^2 = \Psi^\dagger (\Psi^\dagger\Psi + 1) \Psi$, the uncertainty $\Delta N = |\alpha|$. Note that this increases
with $|\alpha|$, but the relative uncertainty $\Delta N/\bar N = |\alpha|^{-1}$ decreases, as expected for the
transition to classical mechanics. For a harmonic oscillation, we obtain the result
$\bar H = \hbar\omega\, (|\alpha|^2 + 1/2)$ and $\Delta H = \hbar\omega\, |\alpha|$.

Furthermore, the probability for the Fock state $|n\rangle$, with sharp particle number
and unsharp phase, depends only on the modulus of $\alpha$:

$$|\langle n|\alpha\rangle|^2 = \exp(-|\alpha|^2)\, \frac{|\alpha|^{2n}}{n!} .$$

This is a Poisson distribution $\rho_n$ with mean value $\langle n\rangle = |\alpha|^2$ and uncertainty $\Delta n = |\alpha|$
(see p. 519).
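The mean and the uncertainty of this distribution can be confirmed by direct summation. The sketch below assumes a truncation at `n_max` terms, which is ample for the test value of $\alpha$ chosen here; the names are illustrative.

```python
import math

def photon_number_distribution(alpha, n_max=120):
    """Probabilities |<n|alpha>|^2 for n = 0 .. n_max-1, built recursively."""
    mean = abs(alpha) ** 2
    probs = [math.exp(-mean)]
    for n in range(1, n_max):
        probs.append(probs[-1] * mean / n)    # p_n = p_(n-1) * |alpha|^2 / n
    return probs

alpha = 1.5 - 2.0j                             # arbitrary test value, |alpha| = 2.5
probs = photon_number_distribution(alpha)
total = sum(probs)
mean_n = sum(n * p for n, p in enumerate(probs))
delta_n = math.sqrt(sum(n * n * p for n, p in enumerate(probs)) - mean_n ** 2)
```

The phase of $\alpha$ does not enter: replacing `alpha` by any other complex number of the same modulus leaves `probs` unchanged.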

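For the time dependence, using $H|n\rangle = |n\rangle\, \hbar\omega\, (n + \tfrac12)$ and $|\alpha(0)\rangle = |\alpha_0\rangle$, we
obtain

$$|\alpha(t)\rangle = \exp\Big(-\frac{|\alpha_0|^2 + i\omega t}{2}\Big) \sum_{n=0}^{\infty} \frac{(\alpha_0\, e^{-i\omega t})^n}{\sqrt{n!}}\, |n\rangle ,$$

whence $\langle\alpha(t)|\, \Psi\, |\alpha(t)\rangle = \alpha_0 \exp(-i\omega t)$, and its complex conjugate for the expectation
value of $\Psi^\dagger$. Consequently, we have

$$\bar X(t) = x_0\, \mathrm{Re}(\alpha_0\, e^{-i\omega t}) \quad\text{and}\quad \bar P(t) = p_0\, \mathrm{Im}(\alpha_0\, e^{-i\omega t}) .$$

The Glauber states oscillate harmonically with angular frequency $\omega$ and with fixed
position, momentum, and energy uncertainties. Ehrenfest's equations are also valid
here.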


5.5.4 Quenched States 


We now allow for $\Delta A_1 \neq \Delta A_2$, but keep searching for further states with an uncertainty
product $\Delta A_1 \cdot \Delta A_2$ as small as possible. The necessary equation mentioned in
the last section can be reformulated as the eigenvalue equation

$$\Big(A_1 + i\, \frac{\Delta A_1}{\Delta A_2}\, A_2\Big) |\psi\rangle = |\psi\rangle \Big(\bar A_1 + i\, \frac{\Delta A_1}{\Delta A_2}\, \bar A_2\Big) , \quad\text{with}\quad A_n = A_n^\dagger .$$

But the non-Hermitian operator is now composed linearly of the annihilation operator
$\Psi$ and the creation operator $\Psi^\dagger$. Therefore, we consider the Bogoliubov transformation,
but now for boson operators, with $u = u^* > 0$ and $v = v^*$:

$$\Phi = u\Psi + v\Psi^\dagger \quad\Longleftrightarrow\quad \Phi^\dagger = u\Psi^\dagger + v\Psi .$$

Note that a common phase factor is unimportant, so we may choose $u = u^* > 0$, and
$v \neq v^*$ would then lead to $\Delta A_1 \cdot \Delta A_2 > 1/4$.
With $[\Phi, \Phi^\dagger] = (u^2 - v^2)\, [\Psi, \Psi^\dagger]$, we also require

$$u^2 - v^2 = 1 \quad\Longleftrightarrow\quad [\Phi, \Phi^\dagger] = 1 .$$

For $u = \cosh z$ and $v = \sinh z$, this is possible with a single real parameter $z$ (see
Fig. 5.15). Recall that, for fermions, we had $u^2 + v^2 = 1$ (see p. 457), and therefore
circular instead of hyperbolic functions. Note also that, with $u \ge 1$, we are no longer
allowed to choose $u = 0$ and then replace $\Psi$ by $\Phi^\dagger$. Conversely, $\Psi = u\Phi - v\Phi^\dagger$
and $\Psi^\dagger = u\Phi^\dagger - v\Phi$.

Fig. 5.15 Uncertainties of the quadrature components ($A_1$ horizontal and $A_2$ vertical) for the Glauber
state ($z = 0$) and quenched states ($z = \pm1/2$). As in Fig. 4.20, instead of the $\Delta A$ values, we now plot
ellipses of the same area with corresponding principal axes, where the product is $\Delta A_1 \cdot \Delta A_2 = 1/4$

The Bogoliubov transformation can be carried out by a unitary operator S: 
® = S WS". In particular, if we set S = expA with AT = —A with St = S~', then 
according to Hausdorff (see p. 290), A follows from 


® = uY + vÏ =coshz Y + sinh z WYŻ 
=SYSİ = Y + FIA, Y] + FIA, TA, Y+, 


since here only [A, W] = z has to hold, and thus [A, ¥*] = zW. Consequently, 
A = $z (Y? — YÏ?) up to an arbitrary phase factor in S. The quench operator (or 
“squeeze operator”) 


2_ yyi2 
S(z) = exp oo 


affects the ratio AA; /AAz, as will now be shown. 
Corresponding to the Glauber state |) (eigenstate of Y) is a quenched state S|a), 
an eigenstate of ©. With S = SW, we have in particular, 
Sja) = Sla)a. 


In order to be able to employ |æ} = D(@)|0), we investigate the product SD(q@) with 
D(a) = expla Y? — a* Y) = f (Y, UW"). Since 


Sf (UY, VINS? = f (SWS, SWIS") =f (È, ®t), 


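we also have $S\,f(\Psi, \Psi^\dagger) = f(\Phi, \Phi^\dagger)\,S$. Here,

$$\alpha\,\Phi^\dagger - \alpha^*\,\Phi = (\alpha u - \alpha^* v)\,\Psi^\dagger - (\alpha u - \alpha^* v)^*\,\Psi\;.$$

If we therefore set

$$\beta = u\,\alpha - v\,\alpha^* \quad\Longleftrightarrow\quad \alpha = u\,\beta + v\,\beta^*\;,$$

we find that $S\,D(\alpha) = D(\beta)\,S$, so for the eigenstate of $\Phi$ with eigenvalue $\alpha$,

$$S\,|\alpha\rangle = S\,D(\alpha)\,|0\rangle = D(\beta)\,S\,|0\rangle\;.$$

The quasi-vacuum ("quenched vacuum") is $S|0\rangle$, and it satisfies $\Phi\,S|0\rangle = |o\rangle$, where $|o\rangle$ denotes the zero vector.

The expectation value of the operator $F(\Psi, \Psi^\dagger)$ in the quenched state $S|\alpha\rangle$ is thus the vacuum expectation value of

$$D^\dagger(\alpha)\,S^\dagger\,F(\Psi, \Psi^\dagger)\,S\,D(\alpha) = S^\dagger\,D^\dagger(\beta)\,F(\Psi, \Psi^\dagger)\,D(\beta)\,S\;.$$

In the term $D^\dagger(\beta)\,F(\Psi, \Psi^\dagger)\,D(\beta) = F(\Psi + \beta,\, \Psi^\dagger + \beta^*)$, it is now useful to change the representation $\Psi \to u\,\Phi - v\,\Phi^\dagger$ and $\Psi^\dagger \to u\,\Phi^\dagger - v\,\Phi$, with $\langle 0|S^\dagger\Phi^\dagger = \langle o|$ and $\Phi\,S|0\rangle = |o\rangle$. Thus, it follows in particular that

$$\langle\Psi\rangle = \beta\;,\qquad \langle\Psi^2\rangle = \beta^2 - uv\;,\qquad \langle\Psi^\dagger\Psi\rangle = |\beta|^2 + v^2\;,$$

and for the Hermitian operator $F = f\,\Psi + f^*\,\Psi^\dagger$,

$$\langle F\rangle = f\beta + f^*\beta^* \quad\text{and}\quad \Delta F = |fu - f^*v| = |f|\,\sqrt{\cosh 2z - \cos 2\varphi\,\sinh 2z}\;,$$

where $f = \mathrm{e}^{\mathrm{i}\varphi}\,|f|$. For the two components in the quenched state $S|\alpha\rangle$ with $f = 1/2$ or $f = -\mathrm{i}/2$ and $u > v$, we find

$$\Delta A_1 = \tfrac12\,(u - v) \quad\text{and}\quad \Delta A_2 = \tfrac12\,(u + v)\;.$$

Here we have $u \pm v = \exp(\pm z)$ and hence $\Delta A_1\cdot\Delta A_2 = \tfrac14$ and $\Delta A_1/\Delta A_2 = \exp(-2z)$.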
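As a quick numerical illustration of these formulas, the following minimal sketch evaluates $\Delta F = |fu - f^*v|$ for the two components and confirms the product and ratio just stated. The function names `quench_uncertainties` and `delta_f` are purely illustrative and not part of the text.

```python
import math

def quench_uncertainties(z):
    """Return (Delta A_1, Delta A_2) for quench parameter z, using u = cosh z, v = sinh z."""
    u, v = math.cosh(z), math.sinh(z)
    return 0.5 * (u - v), 0.5 * (u + v)

def delta_f(f, z):
    """Uncertainty of F = f Psi + f* Psi^dagger in a quenched state: |f u - f* v|."""
    u, v = math.cosh(z), math.sinh(z)
    f = complex(f)
    return abs(f * u - f.conjugate() * v)

for z in (-0.5, 0.0, 0.5):
    d1, d2 = quench_uncertainties(z)
    # The product stays at 1/4 while the ratio follows exp(-2z).
    print(f"z={z:+.1f}  dA1={d1:.4f}  dA2={d2:.4f}  product={d1 * d2:.4f}  ratio={d1 / d2:.4f}")
```

The two choices $f = 1/2$ and $f = -\mathrm{i}/2$ reproduce $\Delta A_1$ and $\Delta A_2$, respectively, so `delta_f` can also be used for any intermediate direction $\varphi$.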
Quenched states are appropriate, e.g., when comparing two oscillations of different frequency, because their ground states have $\Delta X = \tfrac12 x_0$ and $\Delta P = \tfrac12 p_0$ with $x_0 p_0 = 2\hbar$, but $x_0$ and $p_0 = \sqrt{2\hbar m\omega}$ depend upon the given frequency. In an "inappropriate basis", the oscillation appears compressed (or expanded) (see Fig. 5.16).


The quenched states are formed under parametric amplification (in this context, 
see the discussion of parametric resonance in Sects. 2.3.10 and 2.4.11), with the 
Hamilton operator 




Fig. 5.16 Influence of the quench parameter $z$ on the particle number (continuous red) and its uncertainty $\Delta N = \sqrt{|u\beta - v\beta^*|^2 + 2u^2v^2}$ (dashed blue), here shown for the quenched vacuum. The average particle number is in fact then as small as possible for a given $z$, but greater than zero for $z \neq 0$


$$H = \hbar\omega\,\Psi^\dagger\Psi - \mathrm{i}\hbar\kappa\;\frac{\exp(\mathrm{i}\omega_{\mathrm p}t)\,\Psi^2 - \exp(-\mathrm{i}\omega_{\mathrm p}t)\,\Psi^{\dagger 2}}{2}\;,$$

or with $H = \hbar\omega\,\Psi^\dagger\Psi + \tfrac12\hbar\kappa\,\{\exp(\mathrm{i}\omega_{\mathrm p}t)\,\Psi^2 + \exp(-\mathrm{i}\omega_{\mathrm p}t)\,\Psi^{\dagger 2}\}$, which can lead back to the former under the phase transformation $\Psi \to \exp(-\mathrm{i}\pi/4)\,\Psi$. Here $\omega$ is the angular frequency of the considered light and $\omega_{\mathrm p} = 2\omega$ that of the pump light, while $\kappa$ gives the (real) coupling constant. The pump light is described classically here, with fixed intensity, and in this sense, the above Hamilton operator is "semi-classical". This will be discussed in more detail in Sect. 5.5.6. We thus have the Heisenberg equation

$$\frac{\mathrm{d}\Psi}{\mathrm{d}t} = -\mathrm{i}\omega\,\Psi + \kappa\,\exp(-2\mathrm{i}\omega t)\,\Psi^\dagger\;.$$

It can be solved by carrying out the time-dependent Bogoliubov transformation

$$\Psi(t) = \exp(-\mathrm{i}\omega t)\,\{\cosh(\kappa t)\,\Psi(0) + \sinh(\kappa t)\,\Psi^\dagger(0)\}\;.$$

The phase factor is unimportant here. This therefore leads to quenched states. For the photon number operator $N(t) = \Psi^\dagger(t)\,\Psi(t)$, we have

$$N(t) = \sinh^2(\kappa t)\;1 + \cosh(2\kappa t)\;N(0) + \sinh(2\kappa t)\;\frac{\Psi^{\dagger 2}(0) + \Psi^2(0)}{2}\;.$$

If there is no light initially so that $N(0) = 0$, then the average photon number increases as $\sinh^2(\kappa t)$, although the result for long times is certainly not correct, because the pump light cannot supply energy inexhaustibly.
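The time-dependent Bogoliubov transformation can be checked numerically. If we write $\Psi(t) = \exp(-\mathrm{i}\omega t)\,\{a(t)\,\Psi(0) + b(t)\,\Psi^\dagger(0)\}$ with real coefficients, the Heisenberg equation reduces to $\dot a = \kappa b$, $\dot b = \kappa a$. The sketch below integrates these two equations with a classical Runge-Kutta step; the function name `amplify` and the step count are illustrative choices, not taken from the text.

```python
import math

def amplify(kappa, t_end, steps=2000):
    """Integrate da/dt = kappa*b, db/dt = kappa*a with RK4 from a(0) = 1, b(0) = 0."""
    a, b = 1.0, 0.0
    h = t_end / steps
    for _ in range(steps):
        k1a, k1b = kappa * b, kappa * a
        k2a, k2b = kappa * (b + 0.5 * h * k1b), kappa * (a + 0.5 * h * k1a)
        k3a, k3b = kappa * (b + 0.5 * h * k2b), kappa * (a + 0.5 * h * k2a)
        k4a, k4b = kappa * (b + h * k3b), kappa * (a + h * k3a)
        a += h * (k1a + 2 * k2a + 2 * k3a + k4a) / 6
        b += h * (k1b + 2 * k2b + 2 * k3b + k4b) / 6
    return a, b

a, b = amplify(kappa=1.0, t_end=1.5)
# For an initial vacuum the mean photon number is b**2, to be compared with sinh^2(kappa t).
print(b * b, math.sinh(1.5) ** 2)
# a**2 - b**2 = 1 keeps the boson commutator intact (u^2 - v^2 = 1).
print(a * a - b * b)
```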


5.5.5 Expansion in Terms of Glauber States 


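In Sect. 5.5.2, we gave different observables (e.g., E, B, H, P) as functions of the field operators $\Psi$, $\Psi^\dagger$. If we now expand the operator $f(\Psi, \Psi^\dagger)$ in terms of Glauber states,

$$f(\Psi, \Psi^\dagger) = \int\frac{\mathrm{d}^2\alpha}{\pi}\,\frac{\mathrm{d}^2\alpha'}{\pi}\;|\alpha\rangle\,\langle\alpha|\,f(\Psi, \Psi^\dagger)\,|\alpha'\rangle\,\langle\alpha'|\;,$$

then we may evaluate the coefficients, if $f(\Psi, \Psi^\dagger)$ is normal ordered, i.e., in all products, the creation operators occur to the left of the annihilation operators:

$$f(\Psi, \Psi^\dagger) = \sum_{rs} f^{(n)}_{rs}\,\Psi^{\dagger r}\,\Psi^{s}\;.$$

Then $\langle\alpha|\,f(\Psi, \Psi^\dagger)\,|\alpha'\rangle = \sum_{rs} f^{(n)}_{rs}\,\alpha^{*r}\alpha'^{s}\,\langle\alpha|\alpha'\rangle$. With the abbreviation

$$f^{(n)}(\alpha', \alpha^*) = \sum_{rs} f^{(n)}_{rs}\,\alpha^{*r}\,\alpha'^{s}\;,$$

it follows that $\langle\alpha|\,f(\Psi, \Psi^\dagger)\,|\alpha'\rangle = f^{(n)}(\alpha', \alpha^*)\,\langle\alpha|\alpha'\rangle$ with

$$\langle\alpha|\alpha'\rangle = \exp\{-\tfrac12\,|\alpha - \alpha'|^2 + \mathrm{i}\,\mathrm{Im}(\alpha^*\alpha')\}\;,$$

according to p. 472.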
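The closed form for the overlap $\langle\alpha|\alpha'\rangle$ can be compared with a direct sum over Fock states. The sketch below assumes the standard expansion coefficients $\langle n|\alpha\rangle = \exp(-\tfrac12|\alpha|^2)\,\alpha^n/\sqrt{n!}$ and truncates the sum at `nmax`, an illustrative cutoff that is adequate for moderate $|\alpha|$.

```python
import cmath
import math

def fock_coeffs(alpha, nmax=80):
    """Coefficients <n|alpha> = exp(-|alpha|^2/2) alpha^n / sqrt(n!) for n < nmax."""
    coeffs = [cmath.exp(-0.5 * abs(alpha) ** 2)]
    for n in range(1, nmax):
        coeffs.append(coeffs[-1] * alpha / math.sqrt(n))
    return coeffs

def overlap_sum(alpha, alpha_p, nmax=80):
    """<alpha|alpha'> evaluated as a truncated sum over Fock states."""
    left, right = fock_coeffs(alpha, nmax), fock_coeffs(alpha_p, nmax)
    return sum(x.conjugate() * y for x, y in zip(left, right))

def overlap_closed(alpha, alpha_p):
    """<alpha|alpha'> = exp(-|alpha - alpha'|^2 / 2 + i Im(alpha* alpha'))."""
    phase = (alpha.conjugate() * alpha_p).imag
    return cmath.exp(-0.5 * abs(alpha - alpha_p) ** 2 + 1j * phase)

alpha, alpha_p = 1.0 + 0.5j, -0.3 + 1.2j
print(overlap_sum(alpha, alpha_p))
print(overlap_closed(alpha, alpha_p))
```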


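The operator $f(\Psi, \Psi^\dagger)$ may also be anti-normal-ordered, with the creation operators to the right of the annihilation operators:

$$f(\Psi, \Psi^\dagger) = \sum_{rs} f^{(a)}_{rs}\,\Psi^{r}\,\Psi^{\dagger s}\;.$$

Then for the function $f(\Psi, \Psi^\dagger)$, just one double integral (over $\mathrm{d}^2\alpha$) suffices. If we insert the unit operator between $\Psi^r$ and $\Psi^{\dagger s}$, then we obtain the important relation

$$f(\Psi, \Psi^\dagger) = \int\frac{\mathrm{d}^2\alpha}{\pi}\;f^{(a)}(\alpha, \alpha^*)\;|\alpha\rangle\langle\alpha|\;,\quad\text{with}\quad f^{(a)}(\alpha, \alpha^*) = \sum_{rs} f^{(a)}_{rs}\,\alpha^{r}\,\alpha^{*s}\;.$$

Here we have $f^{(a)}(\alpha, \alpha^*) \neq f^{(n)}(\alpha, \alpha^*)$, as can be recognized, e.g., from $f(\Psi, \Psi^\dagger) = \Psi\Psi^\dagger = \Psi^\dagger\Psi + 1$, because then $f^{(a)}(\alpha, \alpha^*) = |\alpha|^2$, $f^{(n)}(\alpha, \alpha^*) = |\alpha|^2 + 1$, and so $f^{(n)}(\alpha, \alpha^*) = f^{(a)}(\alpha, \alpha^*) + 1$. More general than $\Psi\,\Psi^{\dagger n} = \Psi^{\dagger n}\,\Psi + n\,\Psi^{\dagger\,n-1}$ is

$$\Psi^{m}\,\Psi^{\dagger n} = \sum_{l} \frac{m!\;n!}{l!\,(m-l)!\,(n-l)!}\;\Psi^{\dagger\,n-l}\,\Psi^{m-l}\;,\qquad \Psi^{\dagger n}\,\Psi^{m} = \sum_{l} \frac{(-)^l\;m!\;n!}{l!\,(m-l)!\,(n-l)!}\;\Psi^{m-l}\,\Psi^{\dagger\,n-l}\;,$$

as can be shown by induction (see Problem 4.20).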




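Note that, from $f^{(a)}(\alpha, \alpha^*)$, we can also determine $f^{(n)}(\alpha, \alpha^*)$, but we cannot determine $f^{(n)}(\alpha', \alpha^*)$ for $\alpha' \neq \alpha$:

$$f^{(n)}(\alpha, \alpha^*) = \int\frac{\mathrm{d}^2\alpha'}{\pi}\;\exp(-|\alpha - \alpha'|^2)\;f^{(a)}(\alpha', \alpha'^*)\;,$$

with $f^{(n)}(\alpha, \alpha^*) = \langle\alpha|\,f(\Psi, \Psi^\dagger)\,|\alpha\rangle$ and $|\langle\alpha|\alpha'\rangle|^2 = \exp(-|\alpha - \alpha'|^2)$.
Generally, we may set

$$f(\Psi, \Psi^\dagger) = \int\frac{\mathrm{d}^2\xi}{\pi}\;\exp(\xi\Psi^\dagger)\,\exp(-\xi^*\Psi)\;F^{(n)}(\xi, \xi^*) = \int\frac{\mathrm{d}^2\xi}{\pi}\;\exp(-\xi^*\Psi)\,\exp(\xi\Psi^\dagger)\;F^{(a)}(\xi, \xi^*)\;,$$

with the expansion coefficients

$$F^{(n)}(\xi, \xi^*) = \mathrm{tr}\{\exp(\xi^*\Psi)\,\exp(-\xi\Psi^\dagger)\,f(\Psi, \Psi^\dagger)\}\;,$$
$$F^{(a)}(\xi, \xi^*) = \mathrm{tr}\{\exp(-\xi\Psi^\dagger)\,\exp(\xi^*\Psi)\,f(\Psi, \Psi^\dagger)\}\;.$$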


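If we replace $f(\Psi, \Psi^\dagger)$ in $\mathrm{tr}\{\exp(\xi^*\Psi)\,\exp(-\xi\Psi^\dagger)\,f(\Psi, \Psi^\dagger)\}$ by the normal-ordered double integral $\int\mathrm{d}^2\xi'\,\pi^{-1}\exp(\xi'\Psi^\dagger)\,\exp(-\xi'^*\Psi)\,F^{(n)}(\xi', \xi'^*)$, then we arrive at $\int\mathrm{d}^2\xi'\,\pi^{-1}F^{(n)}(\xi', \xi'^*)\;\mathrm{tr}\{\exp[(\xi - \xi')^*\Psi]\,\exp[-(\xi - \xi')\Psi^\dagger]\}$. If we insert the unit operator $\int\mathrm{d}^2\alpha\,\pi^{-1}|\alpha\rangle\langle\alpha|$ between the two exponential functions in the trace, then we obtain $\int\mathrm{d}^2\alpha\,\pi^{-1}\exp\{(\xi - \xi')^*\alpha - (\xi - \xi')\alpha^*\}$, and the exponent is $2\mathrm{i}\,\mathrm{Im}\{(\xi - \xi')^*\alpha\}$, thus equal to $2\mathrm{i}\,\mathrm{Re}(\xi - \xi')\,\mathrm{Im}\,\alpha - 2\mathrm{i}\,\mathrm{Im}(\xi - \xi')\,\mathrm{Re}\,\alpha$. In this way, we arrive at the Fourier expansions of delta functions of the real and imaginary parts of $2(\xi - \xi')$. This is easily integrated over $\xi'$, and we arrive at $F^{(n)}(\xi, \xi^*)$. The proof for $F^{(a)}(\xi, \xi^*)$ is very similar. We thus obtain the Fourier transforms

$$F^{(n)}(\xi, \xi^*) = \int\frac{\mathrm{d}^2\alpha}{\pi}\;\exp(\xi^*\alpha - \xi\alpha^*)\;f^{(n)}(\alpha, \alpha^*)\;,$$
$$f^{(n)}(\alpha, \alpha^*) = \int\frac{\mathrm{d}^2\xi}{\pi}\;\exp(\xi\alpha^* - \xi^*\alpha)\;F^{(n)}(\xi, \xi^*)\;.$$

Note that we usually require the normalization factor $2\pi$ for the Fourier transform. Here $\pi$ suffices, because the factor of 2 is already contained in the expression $2\,\mathrm{Im}(\xi^*\alpha) = \mathrm{Im}(2\xi^*\alpha)$. Of course, the relation between $F^{(a)}(\xi, \xi^*)$ and $f^{(a)}(\alpha, \alpha^*)$ is also a Fourier transform.

We have the trace of the anti-normal-ordered products $\exp(\xi^*\Psi)\,\exp(-\xi\Psi^\dagger)$ for $F^{(n)}(\xi, \xi^*)$, and that of the normal-ordered products for $F^{(a)}(\xi, \xi^*)$. In both cases, the product of the exponential functions and of $f(\Psi, \Psi^\dagger)$ can be reformulated as a normal-ordered product of powers of $\Psi$ and $\Psi^\dagger$, and the unit operator inserted between the two factors.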




5.5.6 Density Operator in the Glauber Basis 


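If we set likewise

$$\rho(\Psi, \Psi^\dagger) = \sum_{rs} \rho^{(n)}_{rs}\,\Psi^{\dagger r}\,\Psi^{s} = \sum_{rs} \rho^{(a)}_{rs}\,\Psi^{r}\,\Psi^{\dagger s}$$

for the density operator $\rho(\Psi, \Psi^\dagger)$, then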
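$$\mathrm{tr}\big(\Psi^{r}\,\Psi^{\dagger s}\,\Psi^{\dagger r'}\,\Psi^{s'}\big) = \int\frac{\mathrm{d}^2\alpha}{\pi}\;\alpha^{r+s'}\,\alpha^{*\,s+r'}$$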
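implies the equations

$$\langle f(\Psi, \Psi^\dagger)\rangle = \int\frac{\mathrm{d}^2\alpha}{\pi}\;\rho^{(a)}(\alpha, \alpha^*)\,f^{(n)}(\alpha, \alpha^*) = \int\frac{\mathrm{d}^2\alpha}{\pi}\;\rho^{(n)}(\alpha, \alpha^*)\,f^{(a)}(\alpha, \alpha^*)\;,$$

where one normal and one anti-normal-ordered operator always occur, like covariant and contravariant components for the scalar product. For $f(\Psi, \Psi^\dagger) = 1$, we therefore also have $\int\mathrm{d}^2\alpha\;\rho^{(n)}(\alpha, \alpha^*) = \int\mathrm{d}^2\alpha\;\rho^{(a)}(\alpha, \alpha^*) = \pi$.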

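As for the Wigner function (see Fig. 4.7), the different representations of $\langle f\rangle$ suggest introducing quasi-probability densities, and in particular, the P-function

$$P(\alpha) = \frac{\rho^{(a)}(\alpha, \alpha^*)}{\pi}\;,\quad\text{with}\quad \int\mathrm{d}^2\alpha\;P(\alpha) = 1\;,$$

and the Q-function (or Husimi function)

$$Q(\alpha) = \frac{\rho^{(n)}(\alpha, \alpha^*)}{\pi}\;,\quad\text{with}\quad \int\mathrm{d}^2\alpha\;Q(\alpha) = 1\;.$$

It then follows that

$$\langle f(\Psi, \Psi^\dagger)\rangle = \int\mathrm{d}^2\alpha\;P(\alpha)\,f^{(n)}(\alpha, \alpha^*) = \int\mathrm{d}^2\alpha\;Q(\alpha)\,f^{(a)}(\alpha, \alpha^*)\;.$$

Since $\rho = \int\mathrm{d}^2\alpha\;P(\alpha)\,|\alpha\rangle\langle\alpha|$, but $|\alpha\rangle\langle\alpha|$ does not project on orthogonal states, the P-function is only a quasi-probability density. The Q-function does in fact have the properties of a probability density, i.e., it is real and never negative, with $\rho^{(n)}(\alpha, \alpha^*) = \langle\alpha|\,\rho\,|\alpha\rangle$, but does not lead to the full density operator.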

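Very useful here are also the normal-ordered characteristic function

$$C^{(n)}(\xi, \xi^*) = \langle\exp(\xi\Psi^\dagger)\,\exp(-\xi^*\Psi)\rangle$$

and the anti-normal-ordered characteristic function

$$C^{(a)}(\xi, \xi^*) = \langle\exp(-\xi^*\Psi)\,\exp(\xi\Psi^\dagger)\rangle\;.$$

These can be used to derive the moments at the position $\xi = \xi^* = 0$:

$$\langle\Psi^{\dagger r}\,\Psi^{s}\rangle = (-)^s\,\frac{\partial^{\,r+s}C^{(n)}}{\partial\xi^{r}\,\partial\xi^{*s}} \quad\text{and}\quad \langle\Psi^{s}\,\Psi^{\dagger r}\rangle = (-)^s\,\frac{\partial^{\,r+s}C^{(a)}}{\partial\xi^{r}\,\partial\xi^{*s}}\;.$$

The two functions are related, because, according to p. 290,

$$\exp(\xi\Psi^\dagger - \xi^*\Psi) = \exp(\xi\Psi^\dagger)\,\exp(-\xi^*\Psi)\,\exp(-\tfrac12|\xi|^2) = \exp(-\xi^*\Psi)\,\exp(\xi\Psi^\dagger)\,\exp(+\tfrac12|\xi|^2)\;,$$

so $C^{(n)}(\xi, \xi^*) = C^{(a)}(\xi, \xi^*)\,\exp(|\xi|^2)$. The characteristic functions are the Fourier transforms of $\rho(\alpha, \alpha^*)$, so

$$C^{(a)}(\xi, \xi^*) = \int\frac{\mathrm{d}^2\alpha}{\pi}\;\exp(\xi\alpha^* - \xi^*\alpha)\;\rho^{(n)}(\alpha, \alpha^*)\;,$$
$$\rho^{(n)}(\alpha, \alpha^*) = \int\frac{\mathrm{d}^2\xi}{\pi}\;\exp(\xi^*\alpha - \xi\alpha^*)\;C^{(a)}(\xi, \xi^*)\;,$$

and likewise $C^{(n)}(\xi, \xi^*)$ and $\rho^{(a)}(\alpha, \alpha^*)$ are Fourier transforms of one another.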
Let us consider some useful examples: 


(1) Clearly, for the Glauber state $|\alpha\rangle$, we have

$$C^{(n)}(\xi, \xi^*) = \exp(\xi\alpha^* - \xi^*\alpha)\;,\quad\text{with}\quad \rho = |\alpha\rangle\langle\alpha|\;.$$


(2) For the laser, a superposition of these states with equal amplitude and unknown phase $\arg\alpha = \varphi$ is important. We have to average over $\varphi$ to obtain $\rho = \overline{|\alpha\rangle\langle\alpha|}$. Then we arrive at $C^{(n)}(\xi, \xi^*) = \frac{1}{2\pi}\int_0^{2\pi}\mathrm{d}\varphi\;\exp\{|\xi\alpha|\,(\mathrm{e}^{-\mathrm{i}\varphi} - \mathrm{e}^{\mathrm{i}\varphi})\}$. With $z = |2\xi\alpha|$ and $t = \exp(-\mathrm{i}\varphi)$, the integrand can be expanded in terms of regular Bessel functions $J_n(z)$, because they have the generating function

$$\exp\Big(z\,\frac{t - t^{-1}}{2}\Big) = \sum_{n=-\infty}^{\infty} J_n(z)\;t^n\;,\quad\text{for}\quad t \neq 0\;,$$

with the symmetry $J_{-n}(z) = (-)^n J_n(z)$. If we expand $\exp(\tfrac12 zt)$ and $\exp(-\tfrac12 z/t)$ in series, we obtain the regular Bessel functions

$$J_n(z) = \sum_{k=0}^{\infty}\frac{(-)^k\,(z/2)^{n+2k}}{k!\,(n+k)!}\;,$$

as shown in Fig. 5.17. Note that the spherical Bessel functions $F_l(z)$ mentioned on p. 401 are Bessel functions of half-integer index, viz., $F_l(z) = \sqrt{\pi z/2}\;J_{l+1/2}(z)$.

Fig. 5.17 Regular Bessel functions $J_n$ and irregular Bessel functions, also called Neumann functions $N_n$, for $n$ from 0 (black) to 3 (blue) (continuous for $n$ even, dotted for $n$ odd). Asymptotically, $J_n(x) \approx \sqrt{2/(\pi x)}\,\cos\{x - (n + \tfrac12)\tfrac12\pi\}$ and $N_n(x) \approx \sqrt{2/(\pi x)}\,\sin\{x - (n + \tfrac12)\tfrac12\pi\}$

From the last equation,

$$\exp(\mathrm{i}z\sin\varphi) = \sum_{n=-\infty}^{\infty} J_n(z)\,\exp(\mathrm{i}n\varphi)\;.$$

With this we obtain

$$C^{(n)}(\xi, \xi^*) = J_0(|2\xi\alpha|)\;,\quad\text{with}\quad \rho = \overline{|\alpha\rangle\langle\alpha|}\;.$$

The anti-normal-ordered function $C^{(a)}$ also contains the factor $\exp(-|\xi|^2)$.
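The phase average in example (2) is easy to reproduce numerically. The following sketch averages the Glauber-state result $\exp(\xi\alpha^* - \xi^*\alpha)$ of example (1) over equally spaced phases of $\alpha$ and compares it with the power series for $J_0$. The function names and the number of phase samples are illustrative assumptions.

```python
import cmath
import math

def bessel_j(n, z, terms=60):
    """Power series J_n(z) = sum_k (-1)^k (z/2)^(n+2k) / (k! (n+k)!) for integer n >= 0."""
    return sum((-1) ** k * (z / 2) ** (n + 2 * k)
               / (math.factorial(k) * math.factorial(n + k))
               for k in range(terms))

def phase_average(xi, alpha_abs, samples=2000):
    """Average exp(xi alpha* - xi* alpha) over the phase of alpha at fixed |alpha|."""
    total = 0j
    for m in range(samples):
        alpha = alpha_abs * cmath.exp(2j * math.pi * m / samples)
        total += cmath.exp(xi * alpha.conjugate() - xi.conjugate() * alpha)
    return total / samples

xi, alpha_abs = 0.4 + 0.3j, 2.0           # |2 xi alpha| = 2
print(phase_average(xi, alpha_abs))       # imaginary part averages out
print(bessel_j(0, abs(2 * xi * alpha_abs)))
```

Because the integrand is periodic in the phase, the equally weighted sum converges very quickly, so a modest number of samples already agrees with the series to high accuracy.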
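(3) The quenched state $S|\alpha\rangle$ has the normal-ordered characteristic function

$$\langle\alpha|\,S^\dagger\exp(\xi\Psi^\dagger)\,\exp(-\xi^*\Psi)\,S\,|\alpha\rangle = \exp(\tfrac12|\xi|^2)\;\langle 0|\,S^\dagger D^\dagger(\beta)\,D(\xi)\,D(\beta)\,S\,|0\rangle\;,$$

with $\beta = u\alpha - v\alpha^*$. Here, according to p. 472, we have

$$D^\dagger(\beta)\,D(\xi)\,D(\beta) = \exp(\xi\beta^* - \xi^*\beta)\;D(\xi)\;,$$

whence

$$C^{(n)}(\xi, \xi^*) = \exp(\tfrac12|\xi|^2 + \xi\beta^* - \xi^*\beta)\;\langle 0|\,S^\dagger\exp(\xi\Psi^\dagger - \xi^*\Psi)\,S\,|0\rangle\;.$$

As on p. 475, we replace $\xi\Psi^\dagger - \xi^*\Psi \to (u\xi + v\xi^*)\,\Phi^\dagger - (u\xi + v\xi^*)^*\,\Phi$, and the vacuum expectation value is found to be $\exp(-\tfrac12|u\xi + v\xi^*|^2)$. So in total, for the quenched state,

$$C^{(n)}(\xi, \xi^*) = \exp\Big(\xi\beta^* - \xi^*\beta - v^2\,\xi\xi^* - uv\,\frac{\xi^2 + \xi^{*2}}{2}\Big)\;.$$

This leads, e.g., to the expressions mentioned in connection with Fig. 4.21.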


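(4) According to p. 580, the canonical density operator

$$\rho = \sum_n \frac{\langle N\rangle^n}{\langle N+1\rangle^{n+1}}\;|n\rangle\langle n|$$

with $\langle N\rangle = \{\exp(\hbar\omega/kT) - 1\}^{-1}$ and thus $\langle N+1\rangle = \{1 - \exp(-\hbar\omega/kT)\}^{-1}$ is associated with the temperature $T$. Hence, $|\langle\alpha|n\rangle|^2 = \exp(-|\alpha|^2)\,|\alpha|^{2n}/n!$ implies

$$\rho^{(n)}(\alpha, \alpha^*) = \langle\alpha|\,\rho\,|\alpha\rangle = \frac{1}{\langle N+1\rangle}\,\exp\Big(-\frac{|\alpha|^2}{\langle N+1\rangle}\Big)\;.$$

This means that $C^{(a)}(\xi, \xi^*)$ is the Fourier transform of a Gauss function, thus also a Gauss function, according to p. 23:

$$C^{(a)}(\xi, \xi^*) = \exp\{-\langle N+1\rangle\,|\xi|^2\}\;.$$

The normal-ordered function $C^{(n)}(\xi, \xi^*)$ also requires the factor $\exp(|\xi|^2)$:

$$C^{(n)}(\xi, \xi^*) = \exp\{-\langle N\rangle\,|\xi|^2\} \quad\Longrightarrow\quad \rho^{(a)}(\alpha, \alpha^*) = \frac{1}{\langle N\rangle}\,\exp\Big(-\frac{|\alpha|^2}{\langle N\rangle}\Big)$$

for the canonical distribution.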
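The Gaussian form of $\rho^{(n)}(\alpha, \alpha^*)$ for the canonical distribution follows from a geometric-times-Poisson sum, which can be checked directly. In this sketch the mean photon number `n_mean` is treated as a free input, and the cutoff `nmax` is an illustrative choice.

```python
import math

def q_thermal_sum(alpha_abs2, n_mean, nmax=400):
    """<alpha|rho|alpha> as a Fock-space sum over <N>^n / <N+1>^(n+1) times |<alpha|n>|^2."""
    ratio = n_mean / (n_mean + 1.0)
    total = 0.0
    weight = 1.0 / (n_mean + 1.0)          # thermal weight for n = 0
    poisson = math.exp(-alpha_abs2)        # |<alpha|0>|^2
    for n in range(nmax):
        total += weight * poisson
        weight *= ratio
        poisson *= alpha_abs2 / (n + 1)
    return total

def q_thermal_closed(alpha_abs2, n_mean):
    """Closed form exp(-|alpha|^2 / <N+1>) / <N+1>."""
    return math.exp(-alpha_abs2 / (n_mean + 1.0)) / (n_mean + 1.0)

print(q_thermal_sum(2.5, 1.7), q_thermal_closed(2.5, 1.7))
```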


5.5.7 Atom in a Light Field


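We consider an atom with two eigenstates $\{|{\uparrow}\rangle, |{\downarrow}\rangle\}$ at the energies $\pm\tfrac12\hbar\omega_{\mathrm A}$ and a light field with the energy quantum $\hbar\omega_{\mathrm L}$. The atom can be described using Pauli operators $\sigma$ and the field using Bose operators $\Psi$, $\Psi^\dagger$, and for the coupling $-\mathbf{p}\cdot\mathbf{E}$, the dipole moment with $\sigma_x = \sigma_+ + \sigma_-$ and the field strength with $\mathrm{i}(\Psi - \Psi^\dagger)$, if we combine all remaining factors into the real factor $\tfrac12\hbar g$. The phase transformation $\Psi \to \mathrm{i}\Psi$ changes $-\mathrm{i}(\Psi - \Psi^\dagger)$ into $\Psi + \Psi^\dagger$. In comparison to $\sigma_+\Psi + \sigma_-\Psi^\dagger$, the parts $\sigma_+\Psi^\dagger + \sigma_-\Psi$ couple to states of much higher frequency, viz., $\omega_{\mathrm L} + \omega_{\mathrm A}$ instead of $\omega_{\mathrm L} - \omega_{\mathrm A}$, and therefore do not contribute to the time average. Note that $\sigma_+\Psi$ describes induced or forced absorption, and $\sigma_-\Psi^\dagger$ induced or forced emission. With this we arrive at the Hamilton operator of the Jaynes–Cummings model:

$$H = \tfrac12\hbar\omega_{\mathrm A}\,\sigma_z + \hbar\omega_{\mathrm L}\,\Psi^\dagger\Psi + \tfrac12\hbar g\,(\sigma_+\Psi + \sigma_-\Psi^\dagger)\;.$$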




Fig. 5.18 Eigenfrequencies $\omega_\pm$ in the Jaynes–Cummings model as a function of the detuning $\Delta = \omega_{\mathrm L} - \omega_{\mathrm A}$, each relative to $\omega_{\mathrm A}$, and here for $g/\omega_{\mathrm A} = 0.1$


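Above the ground state $|{\downarrow}, 0\rangle$, with energy $-\tfrac12\hbar\omega_{\mathrm A}$, it couples the state pair $|{\uparrow}, n\rangle$ and $|{\downarrow}, n+1\rangle$, where $n$ is the photon number:

$$H\,|{\uparrow}, n\rangle = \hbar\{(n + \tfrac12)\,\omega_{\mathrm L} - \tfrac12\Delta\}\;|{\uparrow}, n\rangle + \tfrac12\hbar g\sqrt{n+1}\;|{\downarrow}, n+1\rangle\;,$$
$$H\,|{\downarrow}, n+1\rangle = \tfrac12\hbar g\sqrt{n+1}\;|{\uparrow}, n\rangle + \hbar\{(n + \tfrac12)\,\omega_{\mathrm L} + \tfrac12\Delta\}\;|{\downarrow}, n+1\rangle\;,$$

with detuning $\Delta = \omega_{\mathrm L} - \omega_{\mathrm A}$ between the light field and the atom. According to p. 309 ($\tfrac12\mathrm{tr}H \pm \tfrac12\sqrt{(\mathrm{tr}H)^2 - 4\det H}$), the eigenvalues of $H$ are $\hbar\omega_\pm$ (see Fig. 5.18), where

$$\omega_\pm = \omega_{\mathrm L}\,(n + \tfrac12) \pm \tfrac12\,\Omega_n\;,$$

with the generalized (to $\Delta \neq 0$) Rabi frequency

$$\Omega_n = \sqrt{(n+1)\,g^2 + \Delta^2}\;.$$

According to p. 310, the eigenstates associated with this doublet are

$$|+, n\rangle = |{\uparrow}, n\rangle\,\cos\theta_n + |{\downarrow}, n+1\rangle\,\sin\theta_n\;,$$
$$|-, n\rangle = -|{\uparrow}, n\rangle\,\sin\theta_n + |{\downarrow}, n+1\rangle\,\cos\theta_n\;,$$

where $\cos\theta_n = \sqrt{1 - \Delta/\Omega_n}/\sqrt2$ and $\sin\theta_n = \sqrt{1 + \Delta/\Omega_n}/\sqrt2$. They are thus eigenstates of $\Psi^\dagger\Psi + \sigma_+\sigma_-$ with eigenvalue $n + 1$. For the remaining expectation values, we can use

$$\cos(2\theta_n) = -\frac{\Delta}{\Omega_n} \quad\text{and}\quad \sin(2\theta_n) = \frac{\sqrt{n+1}\;g}{\Omega_n}\;.$$

For example, the matrix elements of $\Psi^\dagger\Psi$ and $\sigma_z = 2\sigma_+\sigma_- - 1$ between the basis states with

$$(\Psi^\dagger\Psi - \sigma_+\sigma_-)\,|\pm, n\rangle = |\pm, n\rangle\,(n \mp \cos(2\theta_n)) + |\mp, n\rangle\,\sin(2\theta_n)$$

are easy to evaluate, and their time dependence is known to be $\exp(-\mathrm{i}\omega_\pm t)$. If initially either the state $|{\uparrow}, n\rangle$ was occupied (upper sign) or the state $|{\downarrow}, n+1\rangle$ (lower sign), it follows that

$$\langle\sigma_z\rangle = \pm\,\big(\cos^2(2\theta_n) + \sin^2(2\theta_n)\,\cos\Omega_n t\big)\;.$$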
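The Rabi oscillation formula can be cross-checked by evolving the initial state $|{\uparrow}, n\rangle$ explicitly in the doublet basis $|\pm, n\rangle$. The sketch below does this with the mixing angle $\theta_n$ given above; the function names are illustrative, and the common phase $\exp\{-\mathrm{i}\omega_{\mathrm L}(n + \tfrac12)t\}$ is dropped because it does not affect $\langle\sigma_z\rangle$.

```python
import cmath
import math

def sigma_z_exact(n, g, delta, t):
    """<sigma_z>(t) for the initial state |up, n>, evolved via the doublet eigenstates."""
    omega_n = math.sqrt((n + 1) * g * g + delta * delta)
    c = math.sqrt(0.5 * (1.0 - delta / omega_n))    # cos(theta_n)
    s = math.sqrt(0.5 * (1.0 + delta / omega_n))    # sin(theta_n)
    # |up, n> = c |+, n> - s |-, n>, with time factors exp(-+ i Omega_n t / 2)
    plus = c * cmath.exp(-0.5j * omega_n * t)
    minus = -s * cmath.exp(+0.5j * omega_n * t)
    amp_up = c * plus - s * minus                   # <up, n | psi(t)>
    amp_down = s * plus + c * minus                 # <down, n+1 | psi(t)>
    return abs(amp_up) ** 2 - abs(amp_down) ** 2

def sigma_z_formula(n, g, delta, t):
    """cos^2(2 theta_n) + sin^2(2 theta_n) cos(Omega_n t), upper sign."""
    omega_n = math.sqrt((n + 1) * g * g + delta * delta)
    cos2 = -delta / omega_n
    return cos2 ** 2 + (1.0 - cos2 ** 2) * math.cos(omega_n * t)

print(sigma_z_exact(3, 1.0, 0.7, 2.3), sigma_z_formula(3, 1.0, 0.7, 2.3))
```

With $\Delta \neq 0$ the value never reaches $-1$, whereas at resonance a complete inversion occurs after half a Rabi period.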


In particular, only for a resonance ($\Delta = 0$) do all atoms end up in the other state. This is of course true for other initial light fields, e.g., for the Glauber state $|\alpha\rangle$. If initially the state $|{\uparrow}, \alpha\rangle$ was occupied (upper sign) or the state $|{\downarrow}, \alpha\rangle$ (lower sign), we arrive at the weight factors $\exp(-|\alpha|^2)\,|\alpha|^{2n+1\mp1}\,\{(n + \tfrac12 \mp \tfrac12)!\}^{-1}$. We shall restrict ourselves to the case $|\alpha| \gg 1$. Then the weight factors for $n + \tfrac12 \mp \tfrac12 \approx |\alpha|^2 - \tfrac12$ are particularly large (Stirling's formula on p. 518 is used in the proof), and therefore for an approximate calculation we use the generalized Rabi frequency

$$\Omega_\alpha = \sqrt{(|\alpha|^2 \pm \tfrac12)\,g^2 + \Delta^2}$$

in $\cos^2(2\theta_\alpha) = (\Delta/\Omega_\alpha)^2 = 1 - \sin^2(2\theta_\alpha)$. But for the sum over $\cos(\Omega_n t)$, we have to calculate more precisely by one order. Here the abbreviations


ee ee Qu and K= wei 
30, g 


are useful, because then for the important terms we have $\Omega_n \approx (k + n)\,\omega$, and this leads to the approximation

$$\langle\sigma_z\rangle = \pm\big[\cos^2(2\theta_\alpha) + \sin^2(2\theta_\alpha)\,\exp\{-|\alpha|^2\,(1 - \cos(\omega t))\}\,\cos\{k\omega t + |\alpha|^2\sin(\omega t)\}\big]\;,$$

with the upper sign for the initial state $|{\uparrow}, \alpha\rangle$ and the lower sign for $|{\downarrow}, \alpha\rangle$. Here, in the time $\tfrac12\pi/\omega$, the factor $\exp\{-|\alpha|^2\,(1 - \cos\omega t)\}$ decreases from one to a negligibly small value. The oscillations observed for the Fock state stop after this time, and set in again at the time $2\pi/\omega$ (see Fig. 5.19).
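The absence and return of the oscillations can be seen directly from the exact photon-number sum, without the approximation above. The sketch below treats only the resonant case ($\Delta = 0$) for the initial state $|{\uparrow}, \alpha\rangle$, where $\langle\sigma_z\rangle = \sum_n p_n\cos(g\sqrt{n+1}\,t)$ with Poisson weights $p_n$. The revival time is estimated from a linearization of $\Omega_n$ about $\Omega_\alpha$, i.e., $\omega \approx g^2/(2\Omega_\alpha)$, which is an assumption made here for illustration; all function names are likewise illustrative.

```python
import math

def sigma_z_coherent(alpha_abs2, g, t, nmax=100):
    """Exact <sigma_z>(t) at resonance for |up, alpha>: Poisson-weighted Rabi oscillations."""
    total, weight = 0.0, math.exp(-alpha_abs2)
    for n in range(nmax):
        total += weight * math.cos(g * math.sqrt(n + 1) * t)
        weight *= alpha_abs2 / (n + 1)
    return total

def revival_time(alpha_abs2, g):
    """Estimate 2 pi / omega with omega = g^2 / (2 Omega_alpha) at resonance (upper sign)."""
    omega_alpha = g * math.sqrt(alpha_abs2 + 0.5)
    return 2.0 * math.pi * 2.0 * omega_alpha / (g * g)

def max_abs(alpha_abs2, g, t0, t1, steps=400):
    """Largest |<sigma_z>| on a grid over [t0, t1], a crude measure of the envelope."""
    return max(abs(sigma_z_coherent(alpha_abs2, g, t0 + (t1 - t0) * i / steps))
               for i in range(steps + 1))

t_rev = revival_time(10.0, 1.0)
print("estimated revival time:", t_rev)
print("envelope near t_rev/2 :", max_abs(10.0, 1.0, 0.5 * t_rev - 1.0, 0.5 * t_rev + 1.0))
print("envelope near t_rev   :", max_abs(10.0, 1.0, t_rev - 3.0, t_rev + 3.0))
```

For $|\alpha|^2 = 10$ the envelope is strongly suppressed halfway to the revival and recovers a sizeable fraction of its initial value near $2\pi/\omega$, though not the full amplitude, because the spacing of the $\Omega_n$ is not exactly uniform.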



Fig. 5.19 Absence and return of the excitation of an atom in a light field described by a Glauber state, plotted against $gt/2\pi$. Initially, $|\alpha|^2 = 10$ and the atom was in the ground state. Continuous curve: resonance. Dashed curve: detuning $\Delta = |\alpha g|$. Here $\langle\sigma_z\rangle = 0$ indicates that on average there are equally many atoms in excited states as in the ground state




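This absence and return ("collapse" and "revival") occurs only with the unsharp Rabi frequency $\{\Omega_n\}$, as can be seen by comparing with the semi-classical ansatz: only the atom is treated according to quantum physics, but the field classically. Its Hamilton operator describes an illuminated atom (quasi-atom or dressed atom), which is the expectation value of $H$ with respect to a Glauber state $|\alpha\rangle$:

$$\overline{H} = \langle\alpha|\,H - \hbar\omega_{\mathrm L}\,\Psi^\dagger\Psi\,|\alpha\rangle = \tfrac12\hbar\omega_{\mathrm A}\,\sigma_z + \tfrac12\hbar g\,(\alpha\,\sigma_+ + \alpha^*\sigma_-)\;.$$

Note that we have taken $\hbar\omega_{\mathrm L}|\alpha|^2$ as the zero energy and subtracted, as usual. Here, according to p. 473, the quantity $\alpha = |\alpha|\exp(-\mathrm{i}\omega_{\mathrm L}t)$, and consequently also $\overline{H}$, depends on time. But this can be eliminated by a unitary transformation $U(t) = \exp(\tfrac12\mathrm{i}\omega_{\mathrm L}t\,\sigma_z)$. Here we go over to a reference frame rotating with the light wave. The rotating-wave approximation (RWA) neglects the terms $\sigma_+\Psi^\dagger + \sigma_-\Psi$, and with the new axes, we arrive likewise at the time-independent Hamilton operator $\widetilde{H}$. Since $U$ depends on time, using Problem 4.22, we find

$$\widetilde{H} = U\,\overline{H}\,U^\dagger + \mathrm{i}\hbar\,\dot{U}\,U^\dagger = \tfrac12\hbar\mu\,\sigma_x - \tfrac12\hbar\Delta\,\sigma_z\;,\quad\text{with}\quad \mu = |\alpha|\,g\;.$$

Its eigenvalues are $E_\pm = \pm\tfrac12\hbar\Omega_\alpha$ with $\Omega_\alpha = \sqrt{\mu^2 + \Delta^2}$. Using

$$\mathbf{\Omega}_\alpha = \frac{\mathrm{tr}(\boldsymbol\sigma\,\widetilde{H})}{\hbar} = \mu\,\mathbf{e}_x - \Delta\,\mathbf{e}_z$$

in the equation for the Bloch vector $\langle\boldsymbol\sigma\rangle = \mathrm{tr}(\rho\,\boldsymbol\sigma)$ deduced from the von Neumann equation $\mathrm{i}\hbar\dot\rho = [H, \rho]$ on p. 343, we find

$$\frac{\mathrm{d}\langle\boldsymbol\sigma\rangle}{\mathrm{d}t} = \mathbf{\Omega}_\alpha \times \langle\boldsymbol\sigma\rangle\;.$$

According to the semi-classical ansatz, the Bloch vector thus rotates about the vector $\mathbf{\Omega}_\alpha$ in the reference frame rotating with the light frequency, and with this a complete change from $\langle\sigma_z\rangle = \pm1$ to $\langle\sigma_z\rangle = \mp1$ is only possible for resonance. But since the Bloch vector rotates about $\mathbf{\Omega}_\alpha$ for arbitrarily long times, there is no absence and return semi-classically.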

So far we have not considered spontaneous emission (the coupling to the remaining modes), and this is often more apparent than the absence or the neglected terms $\sigma_+\Psi^\dagger + \sigma_-\Psi$. For a two-level system, it is easy to write down the differential equation, according to p. 381 for $T = 0$:

$$\frac{\mathrm{d}\rho}{\mathrm{d}t} = \frac{[H, \rho]}{\mathrm{i}\hbar} + \frac{[\sigma_-\rho, \sigma_+] + [\sigma_-, \rho\,\sigma_+]}{2\tau} + \frac{[\sigma_z\rho, \sigma_z] + \mathrm{h.c.}}{2\tau_0}\;.$$




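This implies the Bloch equation (see also Problem 4.22)

$$\frac{\mathrm{d}\langle\boldsymbol\sigma\rangle}{\mathrm{d}t} = \mathbf{\Omega}_\alpha \times \langle\boldsymbol\sigma\rangle - \frac{(1 + 4\tau/\tau_0)\,(\langle\sigma_x\rangle\,\mathbf{e}_x + \langle\sigma_y\rangle\,\mathbf{e}_y) + 2\,(\langle\sigma_z\rangle + 1)\,\mathbf{e}_z}{2\tau}\;,$$

or again, writing $\gamma^{-1}$ instead of $2\tau$ and setting $B = 1 + 4\tau/\tau_0$,

$$\frac{\mathrm{d}\langle\boldsymbol\sigma\rangle}{\mathrm{d}t} = O\,\langle\boldsymbol\sigma\rangle - 2\gamma\,\mathbf{e}_z\;,\quad\text{with}\quad O = \begin{pmatrix} -B\gamma & \Delta & 0 \\ -\Delta & -B\gamma & -\mu \\ 0 & \mu & -2\gamma \end{pmatrix}\;.$$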


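The previously skew-symmetric operator $O$ thus obtains some diagonal elements: its eigenvalues are not purely imaginary, and its real part leads to damping. The inverse of $O$ is

$$O^{-1} = \frac{-1}{\gamma\,(2B^2\gamma^2 + 2\Delta^2 + B\mu^2)}\begin{pmatrix} \mu^2 + 2B\gamma^2 & 2\gamma\Delta & -\mu\Delta \\ -2\gamma\Delta & 2B\gamma^2 & -B\mu\gamma \\ -\mu\Delta & B\mu\gamma & B^2\gamma^2 + \Delta^2 \end{pmatrix}\;.$$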


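Using $\langle\boldsymbol\sigma\rangle \approx 2\gamma\,O^{-1}\mathbf{e}_z$, the z-component of the stationary final state is

$$\langle\sigma_z\rangle \approx -\frac{B^2\gamma^2 + \Delta^2}{B^2\gamma^2 + \Delta^2 + \tfrac12 B\mu^2} = \frac{-1}{1 + \tfrac12 B\mu^2/(B^2\gamma^2 + \Delta^2)} = \frac{-1}{1 + I/I_{\mathrm S}}\;,$$

since $\mu^2 = |g\alpha|^2$ is proportional to the light intensity $I$. The saturation intensity $I_{\mathrm S}$ is clearly proportional to $B^2\gamma^2 + \Delta^2$, so at resonance ($\Delta = 0$), it is particularly small and increases quadratically with the detuning $\Delta$. For $I \ll I_{\mathrm S}$, $\langle\sigma_z\rangle$ approaches $-1$, which corresponds to the lower energy state, but for $I \gg I_{\mathrm S}$, it tends towards 0, the two states then being equally probable. For the rotating-wave transformation, the z-component is conserved, while

$$\langle\sigma_x + \mathrm{i}\sigma_y\rangle \approx \mu\;\frac{\Delta + \mathrm{i}B\gamma}{B^2\gamma^2 + \Delta^2 + \tfrac12 B\mu^2}$$

becomes constant due to this transformation (see Fig. 5.20).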
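The stationary Bloch vector can be compared with a direct integration of the damped Bloch equation $\mathrm{d}\langle\boldsymbol\sigma\rangle/\mathrm{d}t = O\langle\boldsymbol\sigma\rangle - 2\gamma\,\mathbf{e}_z$ starting from the ground state. The sketch below uses a classical Runge-Kutta step; the parameter values follow those quoted for Fig. 5.20 ($\Delta = \mu$, $\gamma = \mu/10$, $B = 1$), while the integration time and step count are illustrative assumptions.

```python
def bloch_rhs(s, mu, delta, gamma, big_b):
    """Right-hand side O s - 2 gamma e_z of the damped Bloch equation."""
    sx, sy, sz = s
    return (-big_b * gamma * sx + delta * sy,
            -delta * sx - big_b * gamma * sy - mu * sz,
            mu * sy - 2.0 * gamma * sz - 2.0 * gamma)

def relax(mu, delta, gamma, big_b, t_end=200.0, steps=20000):
    """Integrate with RK4 from the ground state (0, 0, -1) up to t_end."""
    def shifted(s, k, f):
        return tuple(a + f * b for a, b in zip(s, k))
    s, h = (0.0, 0.0, -1.0), t_end / steps
    for _ in range(steps):
        k1 = bloch_rhs(s, mu, delta, gamma, big_b)
        k2 = bloch_rhs(shifted(s, k1, 0.5 * h), mu, delta, gamma, big_b)
        k3 = bloch_rhs(shifted(s, k2, 0.5 * h), mu, delta, gamma, big_b)
        k4 = bloch_rhs(shifted(s, k3, h), mu, delta, gamma, big_b)
        s = tuple(a + h * (b + 2 * c + 2 * d + e) / 6
                  for a, b, c, d, e in zip(s, k1, k2, k3, k4))
    return s

def stationary(mu, delta, gamma, big_b):
    """Closed-form stationary Bloch vector 2 gamma O^{-1} e_z."""
    den = big_b ** 2 * gamma ** 2 + delta ** 2 + 0.5 * big_b * mu ** 2
    return (mu * delta / den,
            mu * big_b * gamma / den,
            -(big_b ** 2 * gamma ** 2 + delta ** 2) / den)

print(relax(1.0, 1.0, 0.1, 1.0))
print(stationary(1.0, 1.0, 0.1, 1.0))
```

The integrated vector spirals into the attractor given by the closed form, which corresponds to the open circle mentioned in the caption of Fig. 5.20.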

We have considered spontaneous emission only semi-classically. In a full quantum mechanical treatment, we would also have to describe the electromagnetic field using operators ($\Psi$, $\Psi^\dagger$), and hence assume the Jaynes–Cummings model. In addition to the considered damping, we would also have to include terms $[\Psi\rho, \Psi^\dagger] + [\Psi, \rho\,\Psi^\dagger]$. This damping couples the Jaynes–Cummings doublets and can be solved analytically only with further approximations.




Fig. 5.20 Motion of the Bloch vector for the illuminated two-level atom (from out of the ground state) using the rotating-wave approximation in the (y, z) plane (top view) and (x, z) plane (side view): left for resonance ($\Delta = 0$) and right with detuning (here $\Delta = \mu$). Dashed curves indicate without spontaneous emission (dissipation, $\gamma = 0$), and continuous curves with spontaneous emission (here $\gamma = \mu/10$ and $B = 1$). Without dissipation, a circle is obtained, otherwise a spiral with the attractor indicated by the open circle. For resonance, the quantization axis lies in the plane of the circle, otherwise not (so the right-hand circle for detuning is inclined, and smaller)


5.5.8 Summary: Photons 


As an example of a many-boson system, we have considered the light field and quan- 
tized the classical Maxwell equations, thereby investigating the quantum properties 
of a classical field. Instead of the occupation-number representation, we prefer to 
take Glauber states, which are “as classical as possible”. Then as polar coordinate 
we have the amplitude and phase of the field and we do indeed find oscillations, in 
contrast to states with sharp energy. 


5.6 Dirac Equation 


5.6.1 Relativistic Invariance 


The Dirac equation is a relativistic equation. Therefore, we use the notation with four-vectors known from electrodynamics (Sect. 3.4). The position vector with its Cartesian components

$$x^k:\quad (x^1, x^2, x^3) \mathrel{\hat=} (x, y, z)\;,\quad\text{with}\quad k \in \{1, 2, 3\}\;,$$

is amended with a further component $x^0 = ct$ (the "light path"), to yield the four-vector $x$ with contravariant components

$$x^\mu:\quad (x^0, x^1, x^2, x^3) \mathrel{\hat=} (ct, x^k)\;,\quad\text{with}\quad \mu \in \{0, 1, 2, 3\}\;.$$

Correspondingly, the components of the mechanical momentum $p$ are (see p. 245)

$$p^\mu:\quad (p^0, p^1, p^2, p^3) \mathrel{\hat=} \Big(\frac{E}{c},\, p^k\Big)$$

and those of the vector potential $A$ are (see p. 239)

$$A^\mu:\quad (A^0, A^1, A^2, A^3) \mathrel{\hat=} \Big(\frac{\Phi}{c},\, A^k\Big)\;.$$


If we consider a particle with charge $q$ in the electromagnetic field, its mechanical momentum differs from the canonical momentum $P$, which has components (see p. 247)

$$P^\mu = p^\mu + q\,A^\mu\;.$$

Apart from the contravariant components (upper index), we also need the covariant components (lower index). These can be derived for the pseudo-Euclidean metric of special relativity theory using the metric tensor

$$(g^{\mu\nu}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} = (g_{\mu\nu})\;.$$

We shall always use Einstein's summation convention from now on, and thus leave out the summation sign whenever the summation index in a product occurs once up, once down. For the present case, $x_\mu = g_{\mu\nu}\,x^\nu$ and hence $x_0 = x^0$, $x_k = -x^k$.

The Lorentz invariant scalar products are sums over products of covariant and contravariant components. In particular, for free particles, we have

$$v^\mu v_\mu = c^2\;,\quad\text{and with}\quad p^\mu = m\,v^\mu\;,\quad\text{also}\quad p^\mu p_\mu = (mc)^2\;.$$

Here $m$ is the mass of the particle under consideration. With $p^\mu p_\mu = (p^0)^2 - \mathbf{p}\cdot\mathbf{p}$, we thus have for free particles

$$(E/c)^2 = (mc)^2 + \mathbf{p}\cdot\mathbf{p}\;.$$

However, we shall generally use the equation $p^\mu p_\mu = (mc)^2$.
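The index gymnastics of this section can be made concrete with a few lines of code. The sketch below lowers an index with the diagonal metric and evaluates the invariant $p^\mu p_\mu$ for a free particle. The function names, the numerical values, and the default choice of units with $c = 1$ are illustrative assumptions.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)   # diagonal of g_{mu nu} = g^{mu nu}

def lower(v):
    """Covariant components v_mu = g_{mu nu} v^nu of a contravariant four-vector."""
    return tuple(g * x for g, x in zip(METRIC, v))

def scalar(a, b):
    """Lorentz-invariant product a^mu b_mu."""
    return sum(x * y for x, y in zip(a, lower(b)))

def four_momentum(m, p, c=1.0):
    """Contravariant p^mu = (E/c, p^k) of a free particle with (E/c)^2 = (mc)^2 + p.p."""
    energy = math.sqrt((m * c * c) ** 2 + sum(pk * pk for pk in p) * c * c)
    return (energy / c, *p)

p_mu = four_momentum(m=0.511, p=(0.3, -0.1, 0.2))
print(lower(p_mu))                       # spatial components change sign
print(scalar(p_mu, p_mu), 0.511 ** 2)    # both equal (mc)^2 with c = 1
```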


5.6.2 Quantum Theory 


In the following we have to replace the observables by Hermitian operators, but we 
shall use the same letters. In particular, p should mean the mechanical momentum 
and $P$ the canonical momentum. Here we have to account for the fact that $P$ does not commute with $A$. Therefore, for all bilinear equations, we shall restrict ourselves initially to the case $qA = 0$ and treat the generalized case only in Sect. 5.6.8.

The Dirac equation is a relativistic equation for a wave field $\psi$ which we shall interpret as a probability amplitude. For the superposition principle to remain valid, the equation has to be linear in $\psi$. In addition, if $\psi(t_0)$ is given, everything at later times should be fixed. Consequently, it must also be a first order differential equation in time, and relativistic covariance then allows only first derivatives with respect to the position. We note that the Schrödinger equation also contains only first derivatives with respect to time, but second derivatives with respect to the position.

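According to the correspondence principle, we have to obtain classical mechanics in the classical limit of special relativity theory. However, we cannot use the equation $p^\mu p_\mu = m^2c^2$, because taking into account

$$P_\mu \mathrel{\hat=} \mathrm{i}\hbar\,\frac{\partial}{\partial x^\mu} \equiv \mathrm{i}\hbar\,\partial_\mu\;,\quad\text{or}\quad p_\mu = \mathrm{i}\hbar\,\partial_\mu - q\,A_\mu\;,$$

it leads to a differential equation of second order, i.e., the Klein–Gordon equation [10, 11], derived also by [12] and [13]. According to Dirac [14], we should make an ansatz with a linear expression in $p_\mu$:

$$(\gamma^\mu p_\mu - mc)\,\psi = 0\;,\quad\text{or}\quad \Big(\mathrm{i}\gamma^\mu\partial_\mu - \frac{q}{\hbar}\,\gamma^\mu A_\mu - \kappa\Big)\,\psi(x) = 0\;,$$

where $\kappa = mc/\hbar$. (It is common practice to set $\hbar = c = 1$ and put the mass $m$ instead of $\kappa$, even though the Compton wavelength $2\pi/\kappa$ is a well known quantity.) Note that, setting

$$\gamma^\mu \mathrel{\hat=} (\gamma^0, \gamma^k)\;,$$

together with $p_\mu \mathrel{\hat=} (E/c, -p^k)$ and $A_\mu \mathrel{\hat=} (\Phi/c, -A^k)$, we have on the one hand,

$$\gamma^\mu p_\mu = \gamma^0\,\frac{E}{c} - \boldsymbol\gamma\cdot\mathbf{p}\;,\qquad \gamma^\mu A_\mu = \gamma^0\,\frac{\Phi}{c} - \boldsymbol\gamma\cdot\mathbf{A}\;,$$

but on the other,

$$\gamma^\mu\partial_\mu = \gamma^0\,\frac{\partial}{c\,\partial t} + \boldsymbol\gamma\cdot\nabla\;,$$

where $\partial_\mu \mathrel{\hat=} (\partial/(c\,\partial t), \nabla_k)$.

We could also have written the Dirac equation in the form $(\gamma^\mu p_\mu + mc)\,\psi = 0$, because the only restriction is $p^\mu p_\mu = (mc)^2$. (In this bilinear equation, we would have to restrict ourselves to $qA = 0$; the generalization to $qA \neq 0$ will follow in Sect. 5.6.8.) We must now deal with this ambiguity.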




5.6.3 Dirac Matrices 


The novel feature in Dirac's ansatz is to take the square root of $p^\mu p_\mu$, i.e., to require $p^\mu p_\mu = (\gamma^\mu p_\mu)^2$. This equation requires $\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 0$ for $\mu \neq \nu$ and $\gamma^\mu\gamma^\mu = g^{\mu\mu}$, if we assume that all the $\gamma^\mu$ commute with the operators considered so far. The four quantities $\gamma^\mu$ must therefore anti-commute, i.e., they cannot be normal numbers. If we make an ansatz with matrices, then $\psi$ must have correspondingly many components. We combine the last equations to give

$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2\,g^{\mu\nu}\;,$$

which is the basic relation defining a Clifford algebra. On the right, we should write the unit operator, but we shall leave it out for many of the following equations.

If only three such operators were necessary, then we could take the Pauli matrices discussed on p. 308, viz.,

$$\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\;,\qquad \sigma^2 = \begin{pmatrix} 0 & -\mathrm{i} \\ \mathrm{i} & 0 \end{pmatrix}\;,\qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\;.$$

Note that, for $\mu \in \{1, 2, 3\}$, we should also have a factor $\pm\mathrm{i}$ for $(\sigma^\mu)^2 = -1$ to hold. Together with the unit matrix, these form a complete basis for $2\times2$ matrices. Consequently, the Dirac matrices must have a higher dimension.

Since the squares of the $\gamma^\mu$ are equal to $+1$ or $-1$, we can form a total of 16 different products. These include unity and the four operators $\gamma^\mu$, plus six 2-products $\gamma^\mu\gamma^\nu$ with $\mu < \nu$, four 3-products $\gamma^\lambda\gamma^\mu\gamma^\nu$ with $\lambda < \mu < \nu$, and finally, the 4-product

$$\gamma^5 = \mathrm{i}\,\gamma^0\gamma^1\gamma^2\gamma^3\;.$$

The index 5 is commonly used, since $\mu$ is sometimes allowed to run from 1 to 4 instead of 0–3. In contrast, authors vary in the use of the factor i. In any case, the abbreviation for the four-product is suggested because $\gamma^\mu\gamma^5 + \gamma^5\gamma^\mu = 0$ and $(\gamma^5)^2 = 1$. Therefore we shall also set $g^{\mu5} = g^{5\mu} = 0$ for $\mu \neq 5$ and $g^{55} = 1$, which is not common practice, and then generalize the starting equation $[\gamma^\mu, \gamma^\nu]_+ = 2\,g^{\mu\nu}$. As basis operators, we prefer to use

$$\sigma^{\mu\nu} = \frac{\mathrm{i}}{2}\,[\gamma^\mu, \gamma^\nu]_-\;,$$

and this is also not standard practice. Given that $\sigma^{\mu\nu} = -\sigma^{\nu\mu}$, this introduces 10 new quantities for which we have included a factor of i. For $\mu \neq \nu$ (including 5), we then have $\sigma^{\mu\nu} = \mathrm{i}\,\gamma^\mu\gamma^\nu$. We also have (again including 5)




yey? =g" io", 
y”y”y“ = gt? y“ + gr y" _ gry” 4+ S ee. oP ; 
A<p 


where e” p = ELIEN gig p/p, and e“"*"? is the totally antisymmetric Levi-Civita 
symbol. In particular, £°!?35 = 1. For the commutators, we now have (also with 5) 


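$$[\gamma^\mu, \gamma^\nu]_- = -2\mathrm{i}\,\sigma^{\mu\nu}\;,$$
$$[\sigma^{\lambda\mu}, \gamma^\nu]_- = 2\mathrm{i}\,(g^{\mu\nu}\gamma^\lambda - g^{\lambda\nu}\gamma^\mu)\;,$$
$$[\sigma^{\kappa\lambda}, \sigma^{\mu\nu}]_- = 2\mathrm{i}\,(g^{\kappa\nu}\sigma^{\lambda\mu} + g^{\lambda\mu}\sigma^{\kappa\nu} - g^{\kappa\mu}\sigma^{\lambda\nu} - g^{\lambda\nu}\sigma^{\kappa\mu})\;,$$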


and for the anti-commutators 


yoy a 2a, 
yoo hati ee 
rA<p 


lo, oh), = Pe", y’ + gt gò” _ g” gi) ’ 


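In the last equation, the Einstein convention implies a sum over $\rho$. For $\kappa \neq \lambda \neq \mu \neq \kappa$, we now obtain

$$\sigma^{\kappa\lambda}\sigma^{\lambda\mu} = -\sigma^{\lambda\mu}\sigma^{\kappa\lambda} = \mathrm{i}\,g^{\lambda\lambda}\,\sigma^{\kappa\mu} \quad\text{and}\quad \sigma^{\kappa\lambda}\sigma^{\kappa\lambda} = g^{\kappa\kappa}\,g^{\lambda\lambda}\;.$$

For the space-like components, we have $\sigma^{12}\sigma^{23} = -\sigma^{23}\sigma^{12} = -\mathrm{i}\,\sigma^{13} = \mathrm{i}\,\sigma^{31}$ (and cyclic permutations in 1, 2, 3), as is usual for the three Pauli operators, whence the letter $\sigma$ has been adopted.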

We have thus introduced 16 operators $\gamma^A$. Of these, only the unit operator commutes with all the others, while each of the others commutes with eight and anti-commutes with the remaining eight. Only the unit operator commutes with all four operators $\gamma^\mu$.

The traces of the 15 operators $\gamma^A$ without the unit operator all vanish. This is immediately clear for the ten products $\sigma^{\mu\nu}$, because $\mathrm{tr}[A, B]$ always vanishes. For the five remaining ones, using $2\mathrm{i}\,g^{\mu\mu}\gamma^\nu = [\gamma^\mu, \sigma^{\mu\nu}]$, we arrive likewise at a commutator and can therefore infer vanishing traces here.

Each product of two operators $\gamma^A$ and $\gamma^B$ is (except for the sign and possibly a factor of i) equal to one of the 16 operators. These 16 operators are linearly independent, because if $\sum_A a_A\,\gamma^A = 0$ were to hold, then this sum could be multiplied by any one $\gamma^B$, and by forming the trace, we could conclude that $a_B = 0$. Clearly, the linear combination gives zero only if all coefficients vanish. Therefore, the 16 operators are indeed linearly independent.
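These algebraic statements can be verified with explicit matrices. The sketch below assumes the usual standard (Dirac) representation, $\gamma^0 = \mathrm{diag}(1, 1, -1, -1)$ and $\gamma^k$ with off-diagonal blocks $\pm\sigma^k$, which is a choice made here for illustration. It uses plain nested lists so that no external library is needed; all helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ONE = block(I2, Z2, Z2, I2)
METRIC = [1, -1, -1, -1]

# Assumed standard representation: gamma^0 diagonal, gamma^k off-diagonal.
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))]
GAMMA += [block(Z2, s, scale(-1, s), Z2) for s in PAULI]
GAMMA5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))

def clifford_ok():
    """Check gamma^mu gamma^nu + gamma^nu gamma^mu = 2 g^{mu nu} for mu, nu in 0..3."""
    for mu in range(4):
        for nu in range(4):
            anti = combine(matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu]))
            target = scale(2 * METRIC[mu] if mu == nu else 0, ONE)
            if not close(anti, target):
                return False
    return True

print("Clifford relation holds:", clifford_ok())
print("gamma5 squared is unity:", close(matmul(GAMMA5, GAMMA5), ONE))
print("traces:", [trace(g) for g in GAMMA + [GAMMA5]])
```

The same helpers can be used to confirm that $\gamma^5$ anti-commutes with each $\gamma^\mu$ and that the products $\sigma^{\mu\nu} = \mathrm{i}\gamma^\mu\gamma^\nu$ are traceless.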

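All 16 matrices $\gamma^A$ are unitary for a Hermitian Hamilton operator. To show this, we multiply the starting equation $(\gamma^\mu p_\mu - mc)\,\psi = 0$ from the left by $c\,\gamma^0$ and use $(\gamma^0)^2 = 1$. We have $\gamma^0(\gamma^\mu p_\mu - mc) = p_0 - \gamma^0(\boldsymbol\gamma\cdot\mathbf{p} + mc)$, and with $c\,p_0 = \mathrm{i}\hbar\,\partial/\partial t - q\Phi$, it then follows that

$$\mathrm{i}\hbar\,\frac{\partial\psi}{\partial t} = H\,\psi\;,\quad\text{with}\quad H = q\Phi + \gamma^0\,(\boldsymbol\gamma\cdot c\mathbf{p} + mc^2)\;.$$

The Hamilton operator $H$ is only Hermitian for $\gamma^0 = \gamma^{0\dagger}$ and $\gamma^0\gamma^k = (\gamma^0\gamma^k)^\dagger = \gamma^{k\dagger}\gamma^0$, so $\gamma^{k\dagger} = -\gamma^k$ and $\gamma^{5\dagger} = \gamma^5$. In what follows, we shall instead use the equation

$$\gamma^{\mu\dagger} = \gamma^0\,\gamma^\mu\,\gamma^0\;,\quad\text{for}\quad \mu \in \{0, 1, 2, 3\}\;.$$

Consequently, we have $\gamma^{\mu\dagger}\gamma^\mu = 1$ for $\mu \in \{0, 1, 2, 3, 5\}$. With this, the remaining operators $\sigma^{\mu\nu}$ are also unitary:

$$\sigma^{\mu\nu\dagger}\,\sigma^{\mu\nu} = (\mathrm{i}\gamma^\mu\gamma^\nu)^\dagger\,(\mathrm{i}\gamma^\mu\gamma^\nu) = \gamma^{\nu\dagger}\gamma^{\mu\dagger}\gamma^\mu\gamma^\nu = 1\;.$$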


This is what we set out to prove, and with this we have also derived the Hamilton 
operator of the Dirac theory. 


5.6.4 Representations of the Dirac Matrices 


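Since we have arrived at 16 linearly independent operators $\gamma^A$, we are dealing with at least $4\times4$ matrices, which may all be written as (super-)matrices of the Pauli matrices, the $2\times2$ zero matrix, and the $2\times2$ unit matrix. In the standard representation $\gamma^0$ is diagonal, in the Weyl representation $\gamma^5$ is. For each, we set

$$\sigma^{kl} = \begin{pmatrix} \sigma^m & 0 \\ 0 & \sigma^m \end{pmatrix}\;,\quad\text{thus in particular}\quad \sigma^{12} = \begin{pmatrix} \sigma^3 & 0 \\ 0 & \sigma^3 \end{pmatrix}\;,$$

with $(k, l, m) = (1, 2, 3)$ or a cyclic permutation thereof, and $\sigma^m$ the $2\times2$ Pauli matrix. Except for these three matrices (and the unit matrix), the two representations are different (see Table 5.2).

These representations can be unitarily transformed into each other using the operator $U = (\gamma^0 + \gamma^5)/\sqrt2 = U^\dagger = U^{-1}$. It relates the first and third components, or again the second and fourth components:

$$(\psi_1, \psi_2, \psi_3, \psi_4) \leftrightarrow (\psi_1 + \psi_3,\; \psi_2 + \psi_4,\; \psi_1 - \psi_3,\; \psi_2 - \psi_4)/\sqrt2\;.$$

With $\gamma^0\gamma^k = -\mathrm{i}\,\sigma^{0k}$, the Hamilton operator $H = q\Phi + \gamma^0\,(\boldsymbol\gamma\cdot c\mathbf{p} + mc^2)$ reads in the standard representation

$$H_{\mathrm S} = q\Phi + \boldsymbol\alpha\cdot c\mathbf{p} + \beta\,mc^2 = \begin{pmatrix} q\Phi + mc^2 & \boldsymbol\sigma\cdot c\mathbf{p} \\ \boldsymbol\sigma\cdot c\mathbf{p} & q\Phi - mc^2 \end{pmatrix}\;,$$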


and in the Weyl representation 




Table 5.2 Standard and 
Weyl representations of the y 
matrices 


4x 4 matrix Standard Weyl 
representation representation 


The standard representation is in fact convenient for low energies, i.e., for $|\boldsymbol\sigma\cdot c\mathbf{p}| \ll mc^2$, but otherwise the Weyl representation is to be preferred, not only for neutrinos and quarks which may have very small masses, but because $H_{\mathrm W}$ is easier to diagonalize. The helicity $\boldsymbol\sigma\cdot\mathbf{p}/p$ is a good quantum number for massless Dirac particles (even for $q\Phi \neq 0$); neutrinos are left-handed and anti-neutrinos right-handed.

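Later, we shall also need the complex-conjugate Dirac matrices for the anti-linear operators used to describe time reversal and charge conjugation. Therefore, for $\mu = 0$ to 3, we now consider

$$\gamma^{\mu*} = \mathcal{B}\,\gamma^\mu\,\mathcal{B}^{-1} \quad\Longrightarrow\quad \gamma^{5*} = -\mathcal{B}\,\gamma^5\,\mathcal{B}^{-1}\;.$$

This fixes $\mathcal{B}$ only up to a numerical factor. We may choose $\mathcal{B}$ unitary:

$$\mathcal{B}^{-1} = \mathcal{B}^\dagger\;.$$

This fixes the modulus of the numerical factor, while its phase remains free to be chosen.

The operator $\mathcal{B}$ depends on the representation of the operators $\gamma^\mu$. $\mathcal{B}$ has to commute with the real $\gamma^\mu$ (for $\mu \in \{0, 1, 2, 3\}$) and to anti-commute with the imaginary ones. Then conversely for $\gamma^5$, e.g., $\gamma^5\mathcal{B} = -\mathcal{B}\gamma^5$ for real $\gamma^5$. In both the representations considered above, $\gamma^0$, $\gamma^1$, $\gamma^3$, and $\gamma^5$ are real and $\gamma^2$ is imaginary, thus in both cases $\mathcal{B} \propto \sigma^{25}$, and only the phase factor remains open. We choose $\mathcal{B} = \sigma^{25}$ and $\mathcal{B}$ real: $\mathcal{B} = \mathcal{B}^*$ and thus $\mathcal{B}^{-1} = \mathcal{B}^\dagger = \widetilde{\mathcal{B}}$, where the tilde denotes the transpose.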



In any case, # is antisymmetric in both representations: 

B=-B. 
This actually holds in all representations, because each unitary transformation leaves 
ZB unitary and antisymmetric. If y = Y yY t holds with y* = & yB" and y* = 
U'y*U = By'B', then also y’ =U A'U By (UB'UB)". There- 
fore, YB'YB' commutes with the four y’”, and consequently, according to 
p. 491, it is a multiple of the unit. Except for a phase factor, 4’ = Y* BW", so 
B =-B for B= -ZB. 

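The complex conjugation operator $\mathcal K$ also depends on the representation. In contrast to $\mathcal B$, it acts on all degrees of freedom, and according to Sect. 4.2.12, it is anti-linear and anti-unitary (for any number $c$):

$$\mathcal K\,c\,\mathcal K^{-1} = c^* \quad\text{and}\quad \mathcal K^{-1} = \mathcal K^\dagger = \mathcal K.$$

But the product $\mathcal K\mathcal B = \mathcal B^*\mathcal K$ does not depend on the representation. Here,

$$(\mathcal K\mathcal B)^2 = \mathcal K\mathcal B\,\mathcal K^{-1}\mathcal B = \mathcal B^*\mathcal B,$$

and since $\mathcal B^* = -\mathcal B^{-1}$, we have generally

$$(\mathcal K\mathcal B)^2 = -1 \quad\text{and}\quad (\mathcal K\mathcal B)^{-1} = (\mathcal K\mathcal B)^\dagger.$$

For $\mu \in \{0, 1, 2, 3\}$, we have in addition $\mathcal K\mathcal B\gamma^\mu = \mathcal K\gamma^{\mu*}\mathcal B = \gamma^\mu\mathcal K\mathcal B$, while

$$\mathcal K\mathcal B\gamma^5 = -\mathcal K\gamma^{5*}\mathcal B = -\gamma^5\mathcal K\mathcal B.$$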
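The stated properties of the conjugation matrix can be verified numerically. This is a minimal sketch, assuming the standard representation and the convention $\sigma^{25} = i\gamma^2\gamma^5$ (which follows from $\gamma^\mu\gamma^\nu = g^{\mu\nu} - i\sigma^{\mu\nu}$ extended to the index 5); all helper names are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def scale(f, a):
    return [[f * x for x in row] for row in a]

def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]

def transpose(a):
    return [list(row) for row in zip(*a)]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Standard representation (assumed): G[0] = gamma^0, G[k] = gamma^k.
G = [block(I2, Z2, Z2, scale(-1, I2))] + [block(Z2, s, scale(-1, s), Z2) for s in PAULI]
G5 = block(Z2, I2, I2, Z2)
ONE = block(I2, Z2, Z2, I2)

B = scale(1j, matmul(G[2], G5))      # sigma^{25} = i gamma^2 gamma^5
B_INV = scale(-1, B)                 # claimed inverse: B^{-1} = -B

b_is_real = close(B, conj(B))
b_is_antisymmetric = close(transpose(B), scale(-1, B))
b_inverse_ok = close(matmul(B, B_INV), ONE)
# gamma^{mu*} = B gamma^mu B^{-1} for mu = 0..3
conjugation_ok = all(close(matmul(B, matmul(g, B_INV)), conj(g)) for g in G)
# gamma^{5*} = -B gamma^5 B^{-1}
g5_sign_flip = close(matmul(B, matmul(G5, B_INV)), scale(-1, conj(G5)))
# (KB)^2 = B^* B, expected to be -1
kb_squared = matmul(conj(B), B)
```

In this representation `B` comes out as a real, antisymmetric permutation-like matrix, which makes the relations $\mathcal B^{-1} = \mathcal B^\dagger = -\mathcal B$ and $(\mathcal K\mathcal B)^2 = -1$ easy to see entry by entry.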


5.6.5 Behavior of the Dirac Equation Under Lorentz 
Transformations 


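The equation

$$(\gamma^\mu p_\mu - mc)\,\psi = 0, \quad\text{with}\quad \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu},$$

is written in a relativistically covariant way. For a change of coordinates $x^\mu \to x'^\mu$, we have

$$(\gamma'^\mu p'_\mu - mc)\,\psi = 0.$$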


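The notation $\gamma^\mu p_\mu$ indicates a relativistic invariant (a scalar). Hence for a homogeneous Lorentz transformation (see p. 232),

$$x'^\mu = a^\mu{}_\nu\,x^\nu, \quad\text{with}\quad a^\mu{}_\rho\,a_\mu{}^\sigma = g_\rho{}^\sigma = a_\rho{}^\mu\,a^\sigma{}_\mu \quad\text{and}\quad a^\mu{}_\nu{}^* = a^\mu{}_\nu,$$

the Dirac matrices have to transform as a 4-vector, viz.,

$$\gamma'^\mu = a^\mu{}_\nu\,\gamma^\nu,$$

and with $\gamma^\kappa\gamma^\lambda + \gamma^\lambda\gamma^\kappa = 2g^{\kappa\lambda}$, it follows that

$$\gamma'^\mu\gamma'^\nu + \gamma'^\nu\gamma'^\mu = a^\mu{}_\kappa\,a^\nu{}_\lambda\,(\gamma^\kappa\gamma^\lambda + \gamma^\lambda\gamma^\kappa) = 2\,a^\mu{}_\kappa\,a^{\nu\kappa} = 2g^{\mu\nu}.$$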


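We deduce that $\sigma^{\mu\nu}$ transforms as a tensor of second rank, and the unit as a scalar.
We now prove that $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ transforms as a pseudo-scalar if $\gamma'^\mu = a^\mu{}_\nu\gamma^\nu$. It suffices to show that

$$\gamma'^5 = \gamma^5 \det a,$$

because, according to p. 228, all proper Lorentz transformations have $\det a = +1$, while for a space inversion, we have $\det a = -1$. The properties of the determinant can be described with the totally antisymmetric tensor $\varepsilon_{\kappa\lambda\mu\nu}$:

$$\varepsilon_{\kappa\lambda\mu\nu}\,\det a = \varepsilon_{\rho\sigma\tau\upsilon}\,a^\rho{}_\kappa\,a^\sigma{}_\lambda\,a^\tau{}_\mu\,a^\upsilon{}_\nu.$$

The matrix $\gamma^5$ can be taken as $\frac{i}{4!}\,\varepsilon_{\kappa\lambda\mu\nu}\,\gamma^\kappa\gamma^\lambda\gamma^\mu\gamma^\nu$ and $\gamma'^5$ as $\frac{i}{4!}\,\varepsilon_{\kappa\lambda\mu\nu}\,\gamma'^\kappa\gamma'^\lambda\gamma'^\mu\gamma'^\nu$, whence

$$\gamma'^5 = \frac{i}{4!}\,\varepsilon_{\kappa\lambda\mu\nu}\,a^\kappa{}_\rho\,a^\lambda{}_\sigma\,a^\mu{}_\tau\,a^\nu{}_\upsilon\,\gamma^\rho\gamma^\sigma\gamma^\tau\gamma^\upsilon = \gamma^5 \det a.$$

The claim is therefore proven, and $\sigma^{\mu5} \propto \gamma^\mu\gamma^5$ is an axial vector, or pseudo-vector.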

With $[\gamma^\mu, \gamma^\nu]_+ = 2g^{\mu\nu} = [\gamma'^\mu, \gamma'^\nu]_+$, it is usual to take the same Gamma matrices and transfer the transformation to the states $\psi$. After multiplication from the left by $\mathcal L$, the transformed Dirac equation $(\gamma'^\mu p'_\mu - mc)\,\psi = 0$, with

$$\gamma'^\mu = \mathcal L^{-1}\gamma^\mu\mathcal L,$$

becomes

$$(\gamma^\mu p'_\mu - mc)\,\psi' = 0, \quad\text{with}\quad \psi' = \mathcal L\psi.$$

We may thus always calculate with the same Gamma matrices, if we transform the states suitably.


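In order to determine the form of $\mathcal L$, we start with $\mathcal L^{-1}\gamma^\mu\mathcal L = a^\mu{}_\nu\gamma^\nu$. If we take the Hermitian conjugate and multiply on the left and right by $\gamma^0$, then it follows that $\gamma^0\mathcal L^\dagger\gamma^0\,\gamma^\mu\,\gamma^0\mathcal L^{-1\dagger}\gamma^0 = a^\mu{}_\nu\gamma^\nu = \mathcal L^{-1}\gamma^\mu\mathcal L$, and with $\gamma^0 = (\gamma^0)^{-1}$ and $\mathcal L^{-1\dagger} = \mathcal L^{\dagger-1}$, we also have $\mathcal L\gamma^0\mathcal L^\dagger\gamma^0\,\gamma^\mu = \gamma^\mu\,\mathcal L\gamma^0\mathcal L^\dagger\gamma^0$, i.e., $\mathcal L\gamma^0\mathcal L^\dagger\gamma^0$ commutes with all four $\gamma^\mu$ and therefore, according to p. 491, is a multiple of the unit: $\mathcal L\gamma^0\mathcal L^\dagger = b\,\gamma^0$. Here $b$ has to be real, because $\gamma^0$ and hence also the left-hand side are Hermitian. Its sign, with $\mathcal L^\dagger = \gamma^0\mathcal L^{-1}b\,\gamma^0$ or $\mathcal L^\dagger\mathcal L = b\,\gamma^0\mathcal L^{-1}\gamma^0\mathcal L = b\,\gamma^0 a^0{}_\nu\gamma^\nu$, is determined by $a^0{}_0$, because $\gamma^0 a^0{}_\nu\gamma^\nu = a^0{}_0 - i\,a^0{}_k\sigma^{0k}$ and $\mathrm{tr}\,\sigma^{\mu\nu} = 0$ leads to $4b\,a^0{}_0 = \mathrm{tr}\,\mathcal L^\dagger\mathcal L > 0$. For orthochronous Lorentz transformations, the time direction remains unchanged, so $a^0{}_0 > 0$ (see p. 228) and also $b > 0$, while for time reversal, $b < 0$. Here, $|b| = 1$, if we impose the group property that the product of two Lorentz transformations is another Lorentz transformation, and hence (as for the canonical transformations in Sect. 2.4.3) that $\det\mathcal L = 1$ has to be valid. Taking this together then, only $b = \pm1$ remains possible, so

$$\mathcal L^\dagger = \pm\gamma^0\mathcal L^{-1}\gamma^0,$$

with the plus sign for orthochronous Lorentz transformations and the minus sign for time reversal. This means that $\mathcal L$ is not always unitary, and in fact $\psi^\dagger\psi$ transforms as the time-like component of a four-vector, as will be shown in the next section.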
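Let us now consider an infinitesimal Lorentz transformation

$$a_{\mu\nu} \approx g_{\mu\nu} + \omega_{\mu\nu}, \quad\text{with}\quad \omega_{\mu\nu} = -\omega_{\nu\mu} \quad\text{and}\quad |\omega_{\mu\nu}| \ll 1,$$

and make the ansatz $\mathcal L \approx 1 - i\,\omega^{\mu\nu}S_{\mu\nu}$, whence $\mathcal L^{-1} \approx 1 + i\,\omega^{\mu\nu}S_{\mu\nu}$. Then $S_{\mu\nu} = -S_{\nu\mu}$ remains to be determined. Since on the one hand,

$$a^\mu{}_\nu\gamma^\nu = \mathcal L^{-1}\gamma^\mu\mathcal L \approx \gamma^\mu - i\,\omega^{\kappa\lambda}\,(\gamma^\mu S_{\kappa\lambda} - S_{\kappa\lambda}\gamma^\mu),$$

and on the other,

$$a^\mu{}_\nu\gamma^\nu = (g^\mu{}_\nu + \omega^\mu{}_\nu)\,\gamma^\nu \approx \gamma^\mu + \tfrac12\,\omega^{\kappa\lambda}\,(g^\mu{}_\kappa\gamma_\lambda - g^\mu{}_\lambda\gamma_\kappa),$$

we infer that $-i\,[\gamma^\mu, S_{\kappa\lambda}] = \tfrac12\,(g^\mu{}_\kappa\gamma_\lambda - g^\mu{}_\lambda\gamma_\kappa)$. Here, according to p. 491, the quantity $g^\mu{}_\kappa\gamma_\lambda - g^\mu{}_\lambda\gamma_\kappa$ is equal to $-\tfrac12 i\,[\gamma^\mu, \sigma_{\kappa\lambda}]$. This suggests

$$S_{\mu\nu} = \tfrac14\,\sigma_{\mu\nu}.$$

However, a term can be added which commutes with the Dirac matrices, hence a multiple of the unit. But that contradicts the constraint $\det\mathcal L = 1$. Consequently, for infinitesimal transformations,

$$\mathcal L \approx 1 - \tfrac14 i\,\sigma^{\mu\nu}\omega_{\mu\nu}$$

holds uniquely, and, e.g., for a rotation through the small angle $\varepsilon$ about the z-axis, i.e., with $\omega_{21} = -\omega_{12} = \varepsilon$, all others being zero, $\mathcal L(\varepsilon) = 1 + \tfrac12 i\varepsilon\,\sigma_{12}$. With $\sigma_{12}^2 = 1$, this can be generalized for a finite rotation $\mathcal L(\phi) = \mathcal L^{\phi/\varepsilon}(\varepsilon)$ to

$$\mathcal L(\phi) = \cos\frac{\phi}{2} + i\,\sigma_{12}\sin\frac{\phi}{2}.$$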


Here we recognize that these particles have spin 1/2. In particular, for a rotation through $2\pi$, the sign switches, and only after two full rotations does the system return to its original state.
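The sign change after one full turn can be made concrete with a small numerical check. This is a sketch under stated assumptions: the standard representation of the $\gamma$ matrices, $\sigma_{12} = i\gamma^1\gamma^2$, and the finite rotation operator $\cos(\phi/2) + i\sigma_{12}\sin(\phi/2)$; the function name `rotation` is illustrative.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def scale(f, a):
    return [[f * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
S1, S2 = [[0, 1], [1, 0]], [[0, -1j], [1j, 0]]
G1 = block(Z2, S1, scale(-1, S1), Z2)     # gamma^1 (standard representation, assumed)
G2 = block(Z2, S2, scale(-1, S2), Z2)     # gamma^2
ONE = block(I2, Z2, Z2, I2)
SIGMA12 = scale(1j, matmul(G1, G2))       # sigma_12 = i gamma^1 gamma^2

def rotation(phi):
    """Spinor rotation about the z-axis: cos(phi/2) + i sigma_12 sin(phi/2)."""
    return add(scale(math.cos(phi / 2), ONE),
               scale(1j * math.sin(phi / 2), SIGMA12))

sigma12_squared = matmul(SIGMA12, SIGMA12)
full_turn = rotation(2 * math.pi)      # expected: -1
double_turn = rotation(4 * math.pi)    # expected: +1

# Vector behavior of the gamma matrices: L^{-1} gamma^1 L should equal
# cos(phi) gamma^1 + sin(phi) gamma^2 for the rotation angle phi.
phi = 0.7
g1_rotated = matmul(rotation(-phi), matmul(G1, rotation(phi)))
g1_expected = add(scale(math.cos(phi), G1), scale(math.sin(phi), G2))
```

The spinor operator depends on the half angle, whereas the $\gamma$ matrices themselves rotate with the full angle, as a vector should.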

From infinitesimal Lorentz transformations, we can obtain all proper Lorentz transformations. For the improper ones, we may restrict ourselves to time reversal and space inversion, possibly combined with a proper Lorentz transformation, and we shall discuss these in detail in Sect. 5.6.7. There we shall also consider charge inversion (charge conjugation). We may then understand why the solutions $\psi$ have four rather than two components.


5.6.6 Adjoint Spinors and Bilinear Covariants 


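So far we have been considering the Dirac equation $(\gamma^\mu p_\mu - mc)\,\psi = 0$. Then, with $\gamma^{\mu\dagger} = \gamma^0\gamma^\mu\gamma^0$, for $\mu \in \{0, 1, 2, 3\}$, and $(\gamma^0)^2 = 1$, the Hermitian adjoint Dirac equation is

$$\bar\psi\,(\gamma^\mu p_\mu - mc) = 0, \quad\text{with}\quad \bar\psi \equiv \psi^\dagger\gamma^0.$$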


Instead of the Hermitian conjugate spinors $\psi^\dagger$, it is better then to consider the adjoint $\bar\psi$ of $\psi$, because the same operator acts on $\psi$ and $\bar\psi$, once on the right, once on the left. Here, in the standard representation, we have $\bar\psi = (\psi_1^*, \psi_2^*, -\psi_3^*, -\psi_4^*)$, but in the Weyl representation, $\bar\psi = (\psi_3^*, \psi_4^*, \psi_1^*, \psi_2^*)$, where we have set $\psi^\dagger = (\psi_1^*, \psi_2^*, \psi_3^*, \psi_4^*)$ in both cases.

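In the real-space representation of $p_\mu = P_\mu - qA_\mu$, according to p. 489, $P_\mu$ corresponds to the operator $i\hbar\,\partial_\mu$. With $\langle\psi|P_\mu|x\rangle = \langle x|P_\mu|\psi\rangle^* = -i\hbar\,\partial_\mu\psi^*(x)$, $p_\mu$ acts like the operator $-i\hbar\,\partial_\mu - qA_\mu$ acting on the left. Consequently, we may write the adjoint Dirac equation in the real-space representation in the form

$$(i\hbar\,\partial_\mu + qA_\mu)\,\bar\psi\,\gamma^\mu + mc\,\bar\psi = 0,$$

or free of any representation, $(P_\mu + qA_\mu)\,\bar\psi\,\gamma^\mu + mc\,\bar\psi = 0$.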


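For an orthochronous transformation $\psi \to \psi' = \mathcal L\psi$ with $\mathcal L^\dagger = +\gamma^0\mathcal L^{-1}\gamma^0$, we have

$$\bar\psi' = \psi'^\dagger\gamma^0 = \psi^\dagger\mathcal L^\dagger\gamma^0 = \psi^\dagger\gamma^0\mathcal L^{-1} = \bar\psi\,\mathcal L^{-1}$$

and $\bar\psi'\gamma^\mu\psi' = \bar\psi\,\mathcal L^{-1}\gamma^\mu\mathcal L\,\psi = a^\mu{}_\nu\,\bar\psi\gamma^\nu\psi$. Thus,

$\bar\psi\,1\,\psi$ scalar,
$\bar\psi\,\gamma^\mu\,\psi$ vector,
$\bar\psi\,\sigma^{\mu\nu}\,\psi$ tensor,
$\bar\psi\,\sigma^{\mu5}\,\psi$ axial vector,
$\bar\psi\,\gamma^5\,\psi$ pseudo-scalar,

as was to be expected for the operators $\gamma^A$ according to the last section.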
From the differential equations for $\psi(x)$ and $\bar\psi(x)$, viz.,

$$\gamma^\mu\,(i\hbar\,\partial_\mu - qA_\mu)\,\psi = +mc\,\psi, \quad\text{i.e.,}\quad \gamma^\mu\partial_\mu\psi = -\frac{i}{\hbar}\,(qA_\mu\gamma^\mu\psi + mc\,\psi),$$
$$(i\hbar\,\partial_\mu + qA_\mu)\,\bar\psi\,\gamma^\mu = -mc\,\bar\psi, \quad\text{i.e.,}\quad \partial_\mu\bar\psi\,\gamma^\mu = +\frac{i}{\hbar}\,(qA_\mu\bar\psi\gamma^\mu + mc\,\bar\psi),$$

we deduce the "continuity equation"

$$\partial_\mu\,(\bar\psi\gamma^\mu\psi) = 0,$$

and according to p. 239, a conservation law for $\int d^3r\,\bar\psi\gamma^0\psi = \int d^3r\,\psi^\dagger\psi > 0$. Therefore, we relate the time-like component $\bar\psi\gamma^0\psi$ to a "density", in fact the charge density, as will be shown in the next section. However, the different components of $\gamma^\mu$ do not commute with each other, and therefore the probability current is not sharp. This is worth noting for a plane wave, which solves the Dirac equation in the field-free space (with $A^\mu = 0$). Therefore, we often speak here of Zitterbewegung (trembling motion), but we should nevertheless explain the fact that $\psi$ has four components, not just two, as would have been expected for spin-1/2 particles. Hence we consider improper Lorentz transformations and then treat the phenomenon of Zitterbewegung on p. 505.


5.6.7 Space Inversion, Time Reversal, and Charge 
Conjugation 


For these three improper Lorentz transformations, the Dirac equation keeps the same form. However, for time reversal and charge conjugation, we also need here the anti-linear complex conjugation operator $\mathcal K$, which already appeared for time reversal in non-relativistic quantum mechanics (see p. 313). Since the operator $\mathcal K$ does not act only on the Dirac matrices, but also on the remaining quantities, we shall now give the full transformation operator, differently from the proper Lorentz transformations considered so far.

Under space inversion, all polar three-vectors change their sign, while the axial vectors do not, and all time-like components remain conserved. Consequently, $(P'_0, P'_k) = (P_0, -P_k)$ and also $(\Phi'(t', \mathbf r'), \mathbf A'(t', \mathbf r')) = (\Phi(t, -\mathbf r), -\mathbf A(t, -\mathbf r))$.


The Dirac equation thus keeps the same form if $(\gamma^0, \gamma^k)$ are transformed into $(\gamma^0, -\gamma^k)$. This can be done with

$$\mathcal P = \mathcal P_0\,\gamma^0,$$

where $\mathcal P_0$ is the inversion in the usual space, which we already need in non-relativistic quantum mechanics. The sign remains undetermined, because a rotation by $2\pi$ changes the sign of $\psi$ without changing any measurement values. The phase factor has been chosen such that

$$\mathcal P^2 = 1,$$

as in the non-relativistic case. We then also have $\mathcal P = \mathcal P^\dagger = \mathcal P^{-1}$ and

$$(\gamma^\mu p'_\mu - mc)\,\mathcal P\psi = 0,$$

as claimed.

Under time reversal, $(t, \mathbf r)$ has to change into $(t', \mathbf r') = (-t, \mathbf r)$ and $(\Phi, \mathbf A)$ changes into $(\Phi'(t', \mathbf r'), \mathbf A'(t', \mathbf r')) = (\Phi(-t, \mathbf r), -\mathbf A(-t, \mathbf r))$, because the magnetic field switches sign for motion reversal. The position vectors remain the same for time reversal, but not the momentum vectors. We thus need an anti-linear transformation, as was shown already on p. 313.

In fact, the time reversal operator $\mathcal T_0$ in real space has the same properties as the anti-linear complex conjugation operator $\mathcal K$, but the latter also changes the Dirac matrices, as we have seen in Sect. 5.6.4. Only the operator $\mathcal K\mathcal B$ commutes with them. $\mathcal B$ acts like a unit operator in real space.

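For the invariance of the Dirac equation under time reversal (motion reversal), we need an anti-linear operator which changes the sign of the three space-like Dirac matrices. This we can do with

$$\mathcal T = \gamma^0\,\mathcal K\mathcal B,$$

where the sign is arbitrary. Since $(\gamma^0\mathcal K\mathcal B)^2 = (\gamma^0)^2\,(\mathcal K\mathcal B)^2$ with $(\gamma^0)^2 = 1$ and $(\mathcal K\mathcal B)^2 = -1$, we thus have

$$\mathcal T^2 = -1 \quad\text{and}\quad \mathcal T^{-1} = \mathcal T^\dagger.$$

These two properties do not depend on the representation. In both the standard and the Weyl representation (with $\mathcal B = \sigma^{25}$), we have to take $\mathcal T = i\sigma^{31}\mathcal K$.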

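Starting with the adjoint Dirac equation $(P_\mu + qA_\mu)\,\bar\psi\,\gamma^\mu + mc\,\bar\psi = 0$ of the last sections, we can construct the charge-conjugate solution. In particular, if we take the transposed matrices of this equation (denoted by a tilde) and set

$$\widetilde{\gamma^\mu} = -\mathcal U^{-1}\gamma^\mu\,\mathcal U, \quad\text{for}\quad \mu \in \{0, 1, 2, 3\} \quad (\Longrightarrow\ \widetilde{\gamma^5} = +\mathcal U^{-1}\gamma^5\,\mathcal U),$$

and if we multiply $\widetilde{\gamma^\mu}\,(P_\mu + qA_\mu)\,\widetilde{\bar\psi} + mc\,\widetilde{\bar\psi} = 0$ by $-\mathcal U$, we obtain

$$\{\gamma^\mu\,(P_\mu + qA_\mu) - mc\}\;\mathcal U\,\widetilde{\bar\psi} = 0.$$

The sign of the charge $q$ has been reversed here, relative to that in the original Dirac equation, and this is the required charge conjugation. Hence we infer the charge conjugation operator

$$\mathcal C = \gamma^0\,\mathcal U\,\mathcal K,$$

since with $\bar\psi = \psi^\dagger\gamma^0$, we have $\widetilde{\bar\psi} = \widetilde{\gamma^0}\,\mathcal K\psi$ and therefore $\mathcal U\,\widetilde{\bar\psi} = -\gamma^0\,\mathcal U\,\mathcal K\psi$. Note that the phase factor remains arbitrary.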

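The properties of the operators $\mathcal U$ follow from $\widetilde{\gamma^\mu} = -\mathcal U^{-1}\gamma^\mu\,\mathcal U$, but only up to a factor, which allows us to choose $\mathcal U$ unitary, i.e., $\mathcal U^{-1} = \mathcal U^\dagger$. Since $\widetilde{\gamma^\mu} = (\gamma^{\mu\dagger})^* = (\gamma^0\gamma^\mu\gamma^0)^* = \mathcal B\,\gamma^0\gamma^\mu\gamma^0\,\mathcal B^{-1}$, we must still require $\gamma^0\gamma^\mu\gamma^0 = -\mathcal B^{-1}\,\mathcal U^{-1}\gamma^\mu\,\mathcal U\,\mathcal B$. Thus the three operators $\gamma^k$ commute with $\mathcal U\mathcal B$, while $\gamma^0$ and $\gamma^5$ anti-commute with it. Therefore, $\mathcal U\mathcal B$ is proportional to $\sigma^{05}$, independently of the representation, and consequently $\mathcal U$ with $\mathcal B^{-1} = -\mathcal B$ is proportional to $\sigma^{05}\mathcal B$. The still missing factor has to have the absolute value one, because $\mathcal U$, $\sigma^{05}$, and $\mathcal B$ should be unitary. We can thus write $\mathcal U = u\,\sigma^{05}\mathcal B$ with $|u| = 1$.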

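For the charge conjugation operator $\mathcal C$, we thus have $u\,\gamma^0\sigma^{05}\,\mathcal K\mathcal B = i\,u\,\gamma^5\,\mathcal K\mathcal B$. In the following, we choose $u = -i$, whence the charge conjugation operator is

$$\mathcal C = \gamma^5\,\mathcal K\mathcal B.$$

Independently of the representation, we thus find

$$\mathcal C^\dagger\mathcal C = (\gamma^5\mathcal K\mathcal B)^\dagger\,\gamma^5\mathcal K\mathcal B = (\mathcal K\mathcal B)^\dagger\,\gamma^{5\dagger}\gamma^5\,\mathcal K\mathcal B = 1,$$

along with $\mathcal C^2 = (\gamma^5\mathcal K\mathcal B)^2 = -(\gamma^5)^2\,(\mathcal K\mathcal B)^2 = 1$, and hence,

$$\mathcal C = \mathcal C^\dagger = \mathcal C^{-1}.$$

The charge conjugation operator is thus unitary and anti-commutes with all Gamma matrices except for the unit: $\mathcal C\gamma^A = -\gamma^A\mathcal C$, for $\gamma^A \neq 1$. Due to the factor $\mathcal K$, it is anti-linear and therefore $\mathcal C P_\mu = -P_\mu\,\mathcal C$, but $\mathcal C A_\mu = A_\mu\,\mathcal C$. In both the standard and the Weyl representation, we have $\mathcal C = -i\gamma^2\mathcal K$.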
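As a consistency check of the explicit form $-i\gamma^2\mathcal K$, the following sketch represents the anti-linear operator by a matrix `M` followed by complex conjugation. It assumes the standard representation; the names `M` and `charge_conjugate` are illustrative and not taken from the text.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def scale(f, a):
    return [[f * x for x in row] for row in a]

def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
G = [block(I2, Z2, Z2, scale(-1, I2))] + [block(Z2, s, scale(-1, s), Z2) for s in PAULI]
G5 = block(Z2, I2, I2, Z2)
ONE = block(I2, Z2, Z2, I2)

M = scale(-1j, G[2])   # matrix part of C = -i gamma^2 K (K = complex conjugation)

def charge_conjugate(psi):
    """Apply C to a spinor given as a flat list: psi_c = M psi^*."""
    star = [complex(x).conjugate() for x in psi]
    return [sum(M[i][k] * star[k] for k in range(4)) for i in range(4)]

m_is_real = close(M, conj(M))
# C^2 = M K M K = M M^*, expected to be the unit matrix
c_squared = matmul(M, conj(M))
# C gamma = -gamma C  <=>  M gamma^* = -gamma M, for gamma^0..gamma^3 and gamma^5
anticommutes = all(close(matmul(M, conj(g)), scale(-1, matmul(g, M)))
                   for g in G + [G5])

psi = [1 + 2j, 0.5j, -1, 3 - 1j]                   # arbitrary sample spinor
psi_twice = charge_conjugate(charge_conjugate(psi))  # should reproduce psi
```

Because `M` is real in this representation, applying the operator twice reduces to `M` squared, which is the unit matrix.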

The combined transition $\mathcal T\mathcal P$ ($= \mathcal P\mathcal T$) is thus described by $\mathcal K\mathcal B\,\mathcal P_0$, and the transition $\mathcal T\mathcal P\mathcal C$ ($= -\mathcal C\mathcal P\mathcal T$) by $\gamma^5\mathcal P_0$. In the next section, we will see how important the operator $\gamma^5 = \gamma^{5\dagger}$ is. But let us already recognize a noteworthy property of the CPT transformation: with $(\gamma^5)^2 = 1$ and $\gamma^\mu\gamma^5 = -\gamma^5\gamma^\mu$, it leaves scalars, pseudo-scalars, and tensors of second rank unaltered, while for vectors and pseudo-vectors, the sign changes. Statements of this kind are the subject of the CPT theorem.

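If we denote the charge-conjugate state of $|\psi\rangle$ by $|\psi\rangle_c$, then by p. 314,

$${}_c\langle\varphi|\psi\rangle_c = \langle\psi|\varphi\rangle \quad\text{and}\quad {}_c\langle\varphi|\,\mathcal C\,O\,\mathcal C^{-1}\,|\psi\rangle_c = \langle\psi|\,O^\dagger\,|\varphi\rangle.$$

With $\mathcal C\gamma^A\mathcal C^{-1} = -\gamma^A$, for $\gamma^A \neq 1$, and $\gamma^{A\dagger} = \gamma^A$, this leads to the expectation values

$$\langle\gamma^A\rangle_c = -\langle\gamma^A\rangle, \quad\text{thus}\quad \langle\gamma^\mu\rangle_c = -\langle\gamma^\mu\rangle, \quad \langle\sigma^{\mu\nu}\rangle_c = -\langle\sigma^{\mu\nu}\rangle, \quad \langle\sigma^{\mu5}\rangle_c = -\langle\sigma^{\mu5}\rangle,$$
$$\langle 1\rangle_c = +\langle 1\rangle, \quad \langle X^\mu\rangle_c = +\langle X^\mu\rangle, \quad \langle P^\mu\rangle_c = -\langle P^\mu\rangle, \quad \langle A^\mu\rangle_c = +\langle A^\mu\rangle.$$

Moreover, $H = q\Phi + \gamma^0\{\boldsymbol\gamma\cdot c\,(\mathbf P - q\mathbf A) + mc^2\}$ yields $\mathcal C\,H(q) = -H(-q)\,\mathcal C$, or

$$\langle H(q)\rangle_c = -\langle H(-q)\rangle.$$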


Thus the eigenvalues of the Hamilton operator change their sign along with the 
charge. If we take them as energy eigenvalues, then we necessarily arrive at negative 
energy values and find no ground state. Thus an arbitrary amount of energy could be 
emitted. (However, for time-dependent forces the Hamilton operator and the energy 
operator agree only for a suitable gauge, so we can also require E = |H| here.) 
We can repair this difficulty if we quantize the field and attach zero energy to the 
vacuum. Every particle creation should cost energy, independently of the charge. 
Due to charge conservation, particles can only be created from the vacuum in pairs 
of opposite charge, and with a supply of energy. Then twice the energy is necessary 
compared with what would be required for one particle (disregarding the binding 
energy between the two). 

Here we recall the non-relativistic Fermi gas. In its ground state, all one-particle 
states below the Fermi edge are occupied, while all those above are empty. Adding 
energy raises a fermion from an occupied state to an unoccupied one. The excited 
state differs from the ground state by a particle-hole pair. This picture is also suitable 
for the Dirac theory. We only have to choose the Fermi edge as the zero energy, i.e., 
as a quasi-vacuum. If a particle is missing from this quasi-vacuum, then we have a 
hole, i.e., an anti-particle, which is a particle of opposite charge and energy (see Fig. 
5.21). 

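As the quantity adjoint to $\psi_c = \mathcal C\psi$, we take $\bar\psi_c = \psi^\dagger\gamma^0\,\mathcal C = \bar\psi\,\mathcal C$. This implies $\bar\psi_c\gamma^0\psi_c = \bar\psi\,\mathcal C\gamma^0\mathcal C\,\psi = -\bar\psi\gamma^0\,\mathcal C\mathcal C\,\psi = -\psi^\dagger\psi$ as expected, with $\langle\gamma^0\rangle_c = -\langle\gamma^0\rangle$.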


5.6.8 Dirac Equation and Klein—Gordon Equation 


We now turn to the problem mentioned in Sect. 5.6.2 that $P$ does not commute with $A$, and therefore additional terms occur for $qA \neq 0$ compared to the Klein-Gordon equation

$$(p^\mu p_\mu - m^2c^2)\,\psi = 0.$$

Fig. 5.21 Charge symmetry. For charge inversion, the signs of $\langle H\rangle$, $\langle P\rangle$, $\langle p\rangle$, and $\langle\sigma\rangle$ are all reversed. The continuum of $H$ eigenvalues of free particles is indicated by dotted lines (left). The eigenvalues of $p$ are shown next to it: top for particles ⊕ and bottom for anti-particles ⊖, and right next to it the same after time reversal (right)


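To this end, it can be advantageous to use the projection operators

$$P_\pm = \tfrac12\,(1 \pm \gamma^5) = P_\pm^\dagger = P_\pm^2, \qquad P_\pm P_\mp = 0, \qquad P_\pm + P_\mp = 1.$$

They commute with $p_\mu$, but not with $\gamma^\mu$, for $\mu \in \{0, 1, 2, 3\}$:

$$P_\pm\gamma^\mu = \gamma^\mu P_\mp, \quad\text{but}\quad P_\pm\gamma^5 = \gamma^5 P_\pm = \pm P_\pm.$$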
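These algebraic properties are easy to confirm with explicit matrices. The sketch below assumes the standard representation and uses the transformation $U = (\gamma^0 + \gamma^5)/\sqrt2$ to show that the projector becomes diagonal in the Weyl representation; helper names are illustrative.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def scale(f, a):
    return [[f * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
G = [block(I2, Z2, Z2, scale(-1, I2))] + [block(Z2, s, scale(-1, s), Z2) for s in PAULI]
G5 = block(Z2, I2, I2, Z2)
ONE = block(I2, Z2, Z2, I2)
ZERO = block(Z2, Z2, Z2, Z2)

P_PLUS = scale(0.5, add(ONE, G5))
P_MINUS = scale(0.5, add(ONE, scale(-1, G5)))

idempotent = (close(matmul(P_PLUS, P_PLUS), P_PLUS)
              and close(matmul(P_MINUS, P_MINUS), P_MINUS))
orthogonal = close(matmul(P_PLUS, P_MINUS), ZERO)
complete = close(add(P_PLUS, P_MINUS), ONE)
# P_+ gamma^mu = gamma^mu P_- for mu = 0..3
swaps = all(close(matmul(P_PLUS, g), matmul(g, P_MINUS)) for g in G)
g5_eigen = close(matmul(P_PLUS, G5), P_PLUS) and close(matmul(P_MINUS, G5), scale(-1, P_MINUS))

# Transform to the Weyl representation with U = (gamma^0 + gamma^5)/sqrt(2) = U^{-1}.
U = scale(1 / math.sqrt(2), add(G[0], G5))
P_PLUS_WEYL = matmul(U, matmul(P_PLUS, U))   # expected: diag(1, 1, 0, 0)
```

In the Weyl form the projector simply picks out the upper two components, which is what makes the reduction to 2-spinors possible.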


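Therefore, $P_\pm\gamma^\mu p_\mu\psi = \gamma^\mu p_\mu P_\mp\psi$ also holds. On the other hand, the Dirac equation implies $\gamma^\mu p_\mu\psi = mc\,\psi$, and $mc$ commutes with $P_\pm$, so from $P_\mp mc\,\psi = P_\mp\gamma^\mu p_\mu\psi = \gamma^\mu p_\mu P_\pm\psi$, we may infer

$$(P_\mp + P_\pm)\,mc\,\psi = (\gamma^\mu p_\mu + mc)\,P_\pm\psi,$$

where $P_\mp + P_\pm = 1$ and division by $mc$ is allowed for $m \neq 0$. From a component $P_+\psi$ or $P_-\psi$, we thus obtain the total solution $\psi$ which has to satisfy the Dirac equation. Consequently,

$$(\gamma^\mu p_\mu - mc)\,(\gamma^\nu p_\nu + mc)\,P_\pm\psi = (\gamma^\mu\gamma^\nu p_\mu p_\nu - m^2c^2)\,P_\pm\psi = 0,$$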


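i.e., each component $P_\pm\psi$ satisfies the same equation. With $\gamma^\mu\gamma^\nu = g^{\mu\nu} - i\sigma^{\mu\nu}$ and $\sigma^{\mu\nu}p_\mu p_\nu = -\sigma^{\nu\mu}p_\mu p_\nu = -\sigma^{\mu\nu}p_\nu p_\mu = \tfrac12\,\sigma^{\mu\nu}\,[p_\mu, p_\nu]$, together with

$$[p_\mu, p_\nu] = [P_\mu - qA_\mu,\ P_\nu - qA_\nu] = -q\,([P_\mu, A_\nu] + [A_\mu, P_\nu]) = -iq\hbar\,(\partial_\mu A_\nu - \partial_\nu A_\mu) = -iq\hbar\,F_{\mu\nu},$$

the equation for the components can be reformulated as

$$(p^\mu p_\mu - m^2c^2 - \tfrac12\,q\hbar\,\sigma^{\mu\nu}F_{\mu\nu})\,P_\pm\psi = 0.$$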


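The operator $p^\mu p_\mu - m^2c^2$ of the Klein-Gordon equation must therefore be amended by the term $-\tfrac12\,q\hbar\,\sigma^{\mu\nu}F_{\mu\nu}$. This couples the different components of $P_\pm\psi$ via the operators $\sigma^{\mu\nu}$, and in the standard representation, disregarding a factor $q\hbar$, it reads

$$-\tfrac12\,(\sigma^{\mu\nu}F_{\mu\nu})_D = \boldsymbol\sigma\cdot\mathbf B - i\,\boldsymbol\alpha\cdot\mathbf E/c = \begin{pmatrix} \boldsymbol\sigma\cdot\mathbf B & -i\,\boldsymbol\sigma\cdot\mathbf E/c \\ -i\,\boldsymbol\sigma\cdot\mathbf E/c & \boldsymbol\sigma\cdot\mathbf B \end{pmatrix},$$

while in the Weyl representation, it reads

$$-\tfrac12\,(\sigma^{\mu\nu}F_{\mu\nu})_W = \begin{pmatrix} \boldsymbol\sigma\cdot(\mathbf B - i\mathbf E/c) & 0 \\ 0 & \boldsymbol\sigma\cdot(\mathbf B + i\mathbf E/c) \end{pmatrix}.$$

Since the projection operators $P_\pm$ are also diagonal in the Weyl representation,

$$P_+ = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad P_- = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},$$

this leads us to 2-spinors $\psi_\pm = (P_\pm\psi)_W$, an advantage over the standard representation:

$$\{p^\mu p_\mu - m^2c^2 + q\hbar\,\boldsymbol\sigma\cdot(\mathbf B \mp i\mathbf E/c)\}\,\psi_\pm = 0.$$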


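Generally, we have

$$p^\mu p_\mu = (P^\mu - qA^\mu)\,(P_\mu - qA_\mu) = P^\mu P_\mu - q\,(P^\mu A_\mu + A^\mu P_\mu) + q^2 A^\mu A_\mu.$$

Now $P^\mu$ commutes with $A_\mu$ for the Lorentz gauge $\partial_\mu A^\mu = 0$, so it follows that $P^\mu A_\mu + A^\mu P_\mu = 2A^\mu P_\mu$. In the scalar product, the order of the operators $P$ and $A$ is thus irrelevant for the Lorentz gauge and we obtain

$$p^\mu p_\mu = (E - q\Phi)^2/c^2 - (\mathbf P - q\mathbf A)^2,$$

whereupon

$$\{(E - q\Phi)^2 - c^2\,(\mathbf P - q\mathbf A)^2 - (mc^2)^2 + q\hbar c\,\boldsymbol\sigma\cdot(c\mathbf B \mp i\mathbf E)\}\,\psi_\pm = 0.$$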


In this way we have reformulated the Dirac equation as two similar equations for 
2-spinors, each being an equation for spin-1/2 particles. (In the standard represen- 
tation, the same goal is pursued with the Foldy—Wouthuysen transformation, but this 
proceeds only stepwise and approximations have to be made.) 

How are the components $\psi_+$ and $\psi_-$ to be interpreted? To find out, we consider the equation $\mathcal K\mathcal B\gamma^5 = -\gamma^5\mathcal K\mathcal B$. It yields $\mathcal C P_\pm = P_\mp\mathcal C$. If $P_+\psi$ describes a particle, then $\mathcal C P_+\psi$ describes its anti-particle, which we find as $P_-\mathcal C\psi$ in the complementary space of the particle. We may thus interpret $\psi_+$ as a particle and $\psi_-$ as its anti-particle.

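In the non-relativistic limit, we have $E - q\Phi \approx mc^2$ and consequently,

$$(E - q\Phi)^2 - (mc^2)^2 \approx 2mc^2\,(E - q\Phi - mc^2).$$

In addition, we may then neglect $\hbar\,\boldsymbol\sigma\cdot\mathbf E$ compared to $2mc\,\Phi$, since for $\mathbf E = -\nabla\Phi$ with $\Delta x\cdot\Delta P \geq \tfrac12\hbar$, we have

$$\Big|\frac{\hbar\,\boldsymbol\sigma\cdot\mathbf E}{2mc\,\Phi}\Big| \approx \frac{\hbar}{2mc\,\Delta x}\,\Big|\frac{\Delta\Phi}{\Phi}\Big| \leq \frac{\Delta P}{mc}\,\Big|\frac{\Delta\Phi}{\Phi}\Big| \ll 1.$$

Therefore, in the non-relativistic limit, we find the Pauli equation (see p. 327)

$$\Big[E - \Big(mc^2 + \frac{(\mathbf P - q\mathbf A)^2}{2m} + q\Phi - \frac{q\hbar}{2m}\,\boldsymbol\sigma\cdot\mathbf B\Big)\Big]\,\psi = 0.$$

Hence there is a real magnetic dipole moment $q\hbar\,\boldsymbol\sigma/2m$, and according to the preceding equation, there is also an electric dipole moment, although this is imaginary and therefore not observable, as Dirac himself stressed [14].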


5.6.9 Energy Determination for Special Potentials 


For free motion ($qA^\mu = 0$), we arrive at the equation

$$\{E^2 - c^2P^2 - (mc^2)^2\}\,\psi = 0 \quad\Longrightarrow\quad E = c\,\sqrt{(mc)^2 + P^2}.$$

Here, the energy does not depend on the spin (degeneracy). In addition to the momentum, the helicity $\boldsymbol\sigma\cdot\mathbf p/p$ also commutes with the free Hamilton operator (in both the standard and the Weyl representation). Therefore the free 2-spinors can be decomposed in terms of their helicity ($\eta = \pm1$). If $\mathbf p$ has the direction $(\theta, \varphi)$, then the helicity states, i.e., the eigenstates of $(\sigma_x\cos\varphi + \sigma_y\sin\varphi)\sin\theta + \sigma_z\cos\theta$, can be represented by

$$|+\rangle = \begin{pmatrix} c \\ s \end{pmatrix}, \qquad |-\rangle = \begin{pmatrix} -s^* \\ c \end{pmatrix},$$

where we use the abbreviations $c = \cos(\tfrac12\theta)$ and $s = \sin(\tfrac12\theta)\exp(i\varphi)$, along with $\langle+| = (c, s^*)$ and $\langle-| = (-s, c)$. The directions of $\mathbf p$ and $\boldsymbol\sigma$ are reversed under charge conjugation, so the helicity is conserved.
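A quick numerical check of the helicity states may be helpful. The sketch below assumes the column vectors $(c, s)$ and $(-s^*, c)$ that correspond to the bras quoted above; the function names are illustrative.

```python
import cmath
import math

def helicity_states(theta, phi):
    """Return the two helicity 2-spinors for momentum direction (theta, phi)."""
    c = math.cos(theta / 2)
    s = math.sin(theta / 2) * cmath.exp(1j * phi)
    plus = [c, s]                     # helicity +1
    minus = [-s.conjugate(), c]       # helicity -1
    return plus, minus

def sigma_dot_n(theta, phi):
    """Matrix of sigma . n for the unit vector n with polar angles (theta, phi)."""
    nx = math.sin(theta) * math.cos(phi)
    ny = math.sin(theta) * math.sin(phi)
    nz = math.cos(theta)
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]

def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def inner(a, b):
    """Scalar product <a|b> with complex conjugation of the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

theta, phi = 1.1, 2.3                 # arbitrary sample direction
plus, minus = helicity_states(theta, phi)
h = sigma_dot_n(theta, phi)

plus_ok = all(abs(x - y) < 1e-12 for x, y in zip(apply(h, plus), plus))     # eigenvalue +1
minus_ok = all(abs(x + y) < 1e-12 for x, y in zip(apply(h, minus), minus))  # eigenvalue -1
overlap = inner(plus, minus)
norm_plus = inner(plus, plus)
```

Any other direction can be substituted for the sample angles; the two states stay orthonormal because $c^2 + |s|^2 = 1$.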

So far we have had to write the Hamilton operator for the free motion as a 4×4 matrix

$$H_D = \gamma^0 mc^2 + c\,\gamma^0\boldsymbol\gamma\cdot\mathbf P,$$

but now we can decompose it into two 2×2 matrices, viz.,

$$H_\pm = \pm c\,\sqrt{(mc)^2 + P^2},$$

where we choose $H_+$ for particles and $H_-$ for anti-particles.

The advantage of this separation can be illustrated by considering, e.g., the velocity. We determine the derivative of the position operator $\mathbf R$ with respect to time via the Heisenberg equation. In the standard representation, this yields

$$\frac{d\mathbf R}{dt} = \frac{[\mathbf R, H_D]}{i\hbar} = c\,\gamma^0\boldsymbol\gamma.$$

Hence not all three Cartesian velocity components, each with modulus $c$, can be sharp simultaneously, because they do not commute with each other. This is often interpreted as Zitterbewegung. But with $[\mathbf R, f(P^2)] = 2i\hbar\,(\partial f/\partial P^2)\,\mathbf P$, we also have the equation

$$\frac{d\mathbf R}{dt} = \frac{[\mathbf R, H_\pm]}{i\hbar} = \frac{c^2\,\mathbf P}{H_\pm},$$

which does indeed make sense, because according to p. 245, for free particles, we have $\mathbf p = c^{-2}E\,\mathbf v$. The split into particles and anti-particles clarifies this matter. The anti-particles move against their momenta here.

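For free motion, the associated 4×4 matrix $H_W$ (see p. 492) can also be decomposed into two 2×2 matrices, one for each of the two helicities $\eta = \pm1$. If we now use the parameter $\tau$ to distinguish particles ($\tau = 1$) and anti-particles ($\tau = -1$), then we obtain the eigenvalue equation

$$\begin{pmatrix} \eta\,cp - \tau E & mc^2 \\ mc^2 & -\eta\,cp - \tau E \end{pmatrix} \begin{pmatrix} \psi_{\tau\eta} \\ \varphi_{\tau\eta} \end{pmatrix} = 0.$$

This leads to the above-mentioned energy eigenvalue (with $E > 0$) and to

$$\frac{\psi_{\tau\eta}}{\varphi_{\tau\eta}} = \frac{\tau E + \eta\,cp}{mc^2} = \frac{mc^2}{\tau E - \eta\,cp}.$$

For the normalization, we invoke the invariant $\bar\psi\psi = 2\,\mathrm{Re}\,\psi_{\tau\eta}^*\varphi_{\tau\eta}$, noting that $|\psi_{\tau\eta}|^2 + |\varphi_{\tau\eta}|^2$ is not suitable here. Then the expressions just found deliver

$$\bar\psi\psi = 2\,|\psi_{\tau\eta}|^2\,\mathrm{Re}\,\frac{\varphi_{\tau\eta}}{\psi_{\tau\eta}} = 2\,|\psi_{\tau\eta}|^2\,\frac{mc^2}{\tau E + \eta\,cp} = 2\,|\varphi_{\tau\eta}|^2\,\mathrm{Re}\,\frac{\psi_{\tau\eta}}{\varphi_{\tau\eta}} = 2\,|\varphi_{\tau\eta}|^2\,\frac{mc^2}{\tau E - \eta\,cp}.$$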

Fig. 5.22 Large amplitude (red) and small amplitude (magenta). These differ by the product $\tau\eta = \pm1$ and depend on $p/mc$. In the case of free motion considered here, we have $p = m\gamma v$, with $v/c = \beta$ and hence $p/mc = \gamma\beta$, and the approximation $\sqrt{p/mc}$ (dashed blue)

With $E > cp$, $\bar\psi\psi$ is positive for particles ($\tau = 1$) and negative for anti-particles ($\tau = -1$). We therefore require $\bar\psi\psi = \tau$ and infer

$$|\psi_{\tau\eta}|^2 = \frac{E + \tau\eta\,cp}{2mc^2} \quad\text{and}\quad |\varphi_{\tau\eta}|^2 = \frac{E - \tau\eta\,cp}{2mc^2}.$$

We choose $\psi_{\tau\eta}$ and $\varphi_{\tau\eta}$ real, and $\psi_{\tau\eta} > 0$. With this and with $2\,\mathrm{Re}\,\psi_{\tau\eta}^*\varphi_{\tau\eta} = \tau$, the sign of $\varphi_{\tau\eta}$ is the same as that of $\tau$:

$$\psi_{\tau\eta} = \sqrt{\frac{E + \tau\eta\,cp}{2mc^2}} \quad\text{and}\quad \varphi_{\tau\eta} = \tau\,\sqrt{\frac{E - \tau\eta\,cp}{2mc^2}},$$

again with $E > 0$. For high energies, $E \approx cp$ and therefore one or the other amplitude is negligible, but $|\mathrm{Re}\,\psi_{\tau\eta}^*\varphi_{\tau\eta}| = \tfrac12$. We speak here of large and small amplitudes (see Fig. 5.22).
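The amplitudes can be checked against the 2×2 eigenvalue equation for all four combinations of $\tau$ and $\eta$. This is a minimal sketch in units where $mc^2 = 1$ by default; the sample value $cp = 0.75$ (giving $E = 1.25$) and the function names are assumptions chosen for illustration.

```python
import math

def amplitudes(tau, eta, cp, mc2=1.0):
    """Energy E > 0 and the real amplitudes psi, phi for given tau, eta = +1 or -1."""
    energy = math.sqrt(mc2 ** 2 + cp ** 2)
    psi = math.sqrt((energy + tau * eta * cp) / (2 * mc2))
    phi = tau * math.sqrt((energy - tau * eta * cp) / (2 * mc2))
    return energy, psi, phi

def residuals(tau, eta, cp, mc2=1.0):
    """Both rows of the eigenvalue equation; each should vanish."""
    energy, psi, phi = amplitudes(tau, eta, cp, mc2)
    row1 = (eta * cp - tau * energy) * psi + mc2 * phi
    row2 = mc2 * psi - (eta * cp + tau * energy) * phi
    return row1, row2

results = {}
for tau in (1, -1):
    for eta in (1, -1):
        energy, psi, phi = amplitudes(tau, eta, 0.75)
        row1, row2 = residuals(tau, eta, 0.75)
        results[(tau, eta)] = {
            "psi": psi,
            "phi": phi,
            "rows_vanish": abs(row1) < 1e-12 and abs(row2) < 1e-12,
            "invariant": 2 * psi * phi,   # should equal tau
        }
```

For $\tau\eta = +1$ the amplitude $\psi$ is the large one and $\varphi$ the small one, and the roles are exchanged for $\tau\eta = -1$.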

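For the Weyl representation, these expressions are then also to be multiplied by the helicity amplitudes mentioned above:

$$\begin{pmatrix} \psi_{++}\,c \\ \psi_{++}\,s \\ \varphi_{++}\,c \\ \varphi_{++}\,s \end{pmatrix}, \quad \begin{pmatrix} -\psi_{+-}\,s^* \\ \psi_{+-}\,c \\ -\varphi_{+-}\,s^* \\ \varphi_{+-}\,c \end{pmatrix}, \quad \begin{pmatrix} \psi_{-+}\,c \\ \psi_{-+}\,s \\ \varphi_{-+}\,c \\ \varphi_{-+}\,s \end{pmatrix}, \quad \begin{pmatrix} -\psi_{--}\,s^* \\ \psi_{--}\,c \\ -\varphi_{--}\,s^* \\ \varphi_{--}\,c \end{pmatrix}.$$

The momentum eigenfunction must be included with all these "internal" wave functions.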

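In a homogeneous magnetic field $\mathbf B = \mathbf B_0$, the Coulomb gauge is $\mathbf A = \tfrac12\,\mathbf B_0\times\mathbf R$, with $\Phi$ and $\mathbf E$ equal to zero, and we have $\mathbf P\cdot\mathbf A = \mathbf A\cdot\mathbf P = \tfrac12\hbar\,\mathbf B_0\cdot\mathbf L$, where we introduce the dimensionless quantity $\mathbf L = \mathbf R\times\mathbf P/\hbar$. This yields

$$(\mathbf P - q\mathbf A)\cdot(\mathbf P - q\mathbf A) = P^2 - q\hbar\,\mathbf B_0\cdot\mathbf L + q^2A^2$$

and

$$\frac{E^2}{c^2} = (mc)^2 + P^2 - q\hbar\,\mathbf B_0\cdot(\mathbf L + \boldsymbol\sigma) + \frac{q^2}{4}\,(\mathbf B_0\times\mathbf R)^2.$$

For charge conjugation, $q$ is to be replaced by $-q$ and $(\mathbf L + \boldsymbol\sigma)$ by $-(\mathbf L + \boldsymbol\sigma)$, and we thus arrive at the same value of $|E|$, despite the charge (a)symmetry.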
For the hydrogen problem with $q\Phi = -e^2/(4\pi\varepsilon_0 r)$ and $\mathbf B = 0$, it is advantageous to take the fine-structure constant

$$\alpha = \frac{e^2}{4\pi\varepsilon_0\,\hbar c} = \frac{1}{137.\ldots}$$

and use the further abbreviation $\mathbf r' = \mathbf r/r$. With $P^2 = -\hbar^2\,(d^2/dr^2 - L^2/r^2)$, we arrive at the differential equation

$$\Big(\frac{d^2}{dr^2} - \frac{m^2c^4 - E^2}{\hbar^2c^2} + \frac{2\alpha E}{\hbar c\,r} - \frac{L^2 \mp i\alpha\,\mathbf r'\cdot\boldsymbol\sigma - \alpha^2}{r^2}\Big)\,\psi_\pm = 0.$$

It is similar to the non-relativistic radial equation of the hydrogen problem, investigated in more detail in [15] (see also p. 422):

$$\Big(\frac{d^2}{d\rho^2} + \frac{2\eta}{\rho} - \frac{l(l+1)}{\rho^2} - 1\Big)\,u_l(\rho) = 0,$$

with the Coulomb parameter $\eta$ (not to be confused with the helicity, which we shall no longer consider). Normalizability requires $\eta - l$ to be a natural number (1, 2, 3, ...). We shall denote it by $n_r + 1$, whence $n_r$ gives the number of nodes of the radial function. (The zeros at the boundaries 0 and $\infty$ do not constitute nodes.)

To exploit this well-known result, we now have to express the eigenvalues of $L^2 \mp i\alpha\,\mathbf r'\cdot\boldsymbol\sigma - \alpha^2$ in terms of $\lambda(\lambda + 1)$. In fact, $\lambda$ is somewhat smaller than $l$, as will now be shown.

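The dipole-field coupling $\propto \boldsymbol\sigma\cdot\mathbf r'/r^2$ does not commute with the orbital angular momentum, but like any scalar, it does commute with the total angular momentum $(\mathbf L + \mathbf S)\,\hbar$, so for the spin angular momentum, we split off the factor $\hbar$ and write $\mathbf S = \tfrac12\boldsymbol\sigma$. It is therefore appropriate to take the coupled representation $|(l\tfrac12)jm\rangle$ of p. 336. In particular, the operator $\mathbf L\cdot\boldsymbol\sigma$ is diagonal, and with $\mathbf L\times\mathbf L = i\mathbf L$, according to p. 325, we have $(\mathbf L\cdot\boldsymbol\sigma)^2 = L^2 + i\,(\mathbf L\times\mathbf L)\cdot\boldsymbol\sigma = L^2 - \mathbf L\cdot\boldsymbol\sigma$, so $L^2 = \mathbf L\cdot\boldsymbol\sigma\,(\mathbf L\cdot\boldsymbol\sigma + 1)$.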

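The term $\mp i\alpha\,\mathbf r'\cdot\boldsymbol\sigma - \alpha^2$ with $(\mathbf r'\cdot\boldsymbol\sigma)^2 = 1$ may also be written

$$\mp i\alpha\,\mathbf r'\cdot\boldsymbol\sigma\,(1 \mp i\alpha\,\mathbf r'\cdot\boldsymbol\sigma).$$

Therefore, it follows that

$$L^2 \mp i\alpha\,\mathbf r'\cdot\boldsymbol\sigma - \alpha^2 = \Lambda\,(\Lambda + 1), \quad\text{with}\quad \Lambda = -\mathbf L\cdot\boldsymbol\sigma \mp i\alpha\,\mathbf r'\cdot\boldsymbol\sigma - 1,$$

if we can prove that $\mathbf L\cdot\boldsymbol\sigma\ \mathbf r'\cdot\boldsymbol\sigma + \mathbf r'\cdot\boldsymbol\sigma\ \mathbf L\cdot\boldsymbol\sigma = -2\,\mathbf r'\cdot\boldsymbol\sigma$. Here, according to p. 325, the left-hand side is equal to $(\mathbf L\cdot\mathbf r' + \mathbf r'\cdot\mathbf L) + i\,(\mathbf L\times\mathbf r' + \mathbf r'\times\mathbf L)\cdot\boldsymbol\sigma$, and with this the first bracket vanishes, because $\mathbf L$ and $1/R$ commute and we have $\mathbf R\cdot(\mathbf R\times\mathbf P) = -\mathbf R\cdot(\mathbf P\times\mathbf R) = -(\mathbf R\times\mathbf P)\cdot\mathbf R$. For the second, we may use $\mathbf R\cdot\mathbf P = 3i\hbar + \mathbf P\cdot\mathbf R$ along with $[\mathbf R, \mathbf P\cdot\mathbf R] = i\hbar\,\mathbf R$ and $[\mathbf P, R^2] = -2i\hbar\,\mathbf R$. This leads to $\mathbf L\times\mathbf R + \mathbf R\times\mathbf L = 2i\,\mathbf R$, whence $\mathbf L\cdot\boldsymbol\sigma\ \mathbf r'\cdot\boldsymbol\sigma + \mathbf r'\cdot\boldsymbol\sigma\ \mathbf L\cdot\boldsymbol\sigma$ is indeed equal to $-2\,\mathbf r'\cdot\boldsymbol\sigma$. We thus obtain

$$\Lambda^2 = (\mathbf L\cdot\boldsymbol\sigma + 1)^2 - \alpha^2.$$

The eigenvalues of this Hermitian operator depend only on $\alpha^2$, not on the sign $\pm i\alpha$. But in our further calculations, we have to distinguish between $j = l \pm \tfrac12$ and we also need the different signs now for another purpose.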

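In particular, by p. 372,

$$(\mathbf L\cdot\boldsymbol\sigma + 1)\,|(l\tfrac12)jm\rangle = \pm\,|(l\tfrac12)jm\rangle\,(j + \tfrac12), \quad\text{for}\quad j = l \pm \tfrac12,$$

and consequently,

$$\Lambda^2\,|(l\tfrac12)jm\rangle = |(l\tfrac12)jm\rangle\,\{(j + \tfrac12)^2 - \alpha^2\},$$

as well as

$$\Lambda\,|(l\tfrac12)jm\rangle = \mp\,|(l\tfrac12)jm\rangle\,\{(j + \tfrac12)^2 - \alpha^2\}^{1/2} \quad\text{for}\quad j = l \pm \tfrac12.$$

Note that the sign follows from the limit $\alpha \to 0$, whence $\Lambda$ tends towards $-\mathbf L\cdot\boldsymbol\sigma - 1$. If we now denote the eigenvalue of $\Lambda\,(\Lambda + 1)$ by $\lambda(\lambda + 1)$, we have

$$\lambda = \{(j + \tfrac12)^2 - \alpha^2\}^{1/2} - \tfrac12 \mp \tfrac12 \quad\text{for}\quad j = l \pm \tfrac12,$$

and hence,

$$\lambda = l - \varepsilon_j, \quad\text{with}\quad \varepsilon_j = j + \tfrac12 - \sqrt{(j + \tfrac12)^2 - \alpha^2} \approx \frac{\alpha^2}{2j + 1}.$$

With this we may now return to the known result of the non-relativistic calculation. Comparing the two radial equations with $(m^2c^4 - E^2)/(\hbar c)^2 = k^2$ and $\alpha E/(\hbar c) = \eta k$ leads to

$$\eta = \frac{\alpha E}{\sqrt{m^2c^4 - E^2}} \quad\Longrightarrow\quad E = \frac{mc^2}{\sqrt{1 + (\alpha/\eta)^2}}.$$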


Normalizability now requires

$$\eta = n_r + 1 + \lambda = n - \varepsilon_j,$$

with the principal quantum number $n = n_r + l + 1$ (see p. 363). Finally, we obtain

$$E = \frac{mc^2}{\sqrt{1 + \alpha^2/(n - \varepsilon_j)^2}} = mc^2 - \frac{E_R}{n^2}\,\Big\{1 + \frac{\alpha^2}{n}\,\Big(\frac{1}{j + \frac12} - \frac{3}{4n}\Big) + \cdots\Big\},$$

where $j \in \{\tfrac12, \ldots, n - \tfrac12\}$, so that $1/(j + \tfrac12) > 3/(4n)$, and the Rydberg energy (see p. 362)

$$E_R = \tfrac12\,\alpha^2\,mc^2.$$

As can already be seen from Fig. 4.18, there is now no degeneracy with respect to the angular momentum $j$ (only with respect to the orbital angular momentum $l$), in contrast to the non-relativistic calculation. The terms indicated by dots in the above may be left out, being smaller than the effects neglected in the Dirac theory (like the Lamb shift mentioned on p. 380).
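To get a feeling for the size of the fine-structure corrections, the closed formula and its expansion can be compared numerically. This is a sketch only: the numerical values of $\alpha$ and of the electron rest energy are approximate inputs assumed here (not quoted in the text), and effects such as the finite nuclear mass are ignored.

```python
import math

ALPHA = 1 / 137.035999        # fine-structure constant (approximate, assumed input)
MC2_EV = 510998.95            # electron rest energy in eV (approximate, assumed input)
RYDBERG_EV = 0.5 * ALPHA ** 2 * MC2_EV   # E_R = alpha^2 m c^2 / 2

def binding_exact(n, j):
    """mc^2 - E in eV from the closed formula E = mc^2 / sqrt(1 + alpha^2/(n - eps_j)^2)."""
    eps = (j + 0.5) - math.sqrt((j + 0.5) ** 2 - ALPHA ** 2)
    energy = MC2_EV / math.sqrt(1 + ALPHA ** 2 / (n - eps) ** 2)
    return MC2_EV - energy

def binding_expanded(n, j):
    """mc^2 - E in eV from the expansion up to the order alpha^4 mc^2."""
    return RYDBERG_EV / n ** 2 * (1 + ALPHA ** 2 / n * (1 / (j + 0.5) - 3 / (4 * n)))

ground_exact = binding_exact(1, 0.5)
ground_expanded = binding_expanded(1, 0.5)
# Splitting between j = 1/2 and j = 3/2 for n = 2 (the j = 1/2 level is bound more strongly).
splitting_n2 = binding_exact(2, 0.5) - binding_exact(2, 1.5)
```

For the ground state the closed formula reduces to $E = mc^2\sqrt{1 - \alpha^2}$, and the difference between the closed formula and the expansion is far below the size of the splitting between the two $n = 2$ levels.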


5.6.10 Difficulties with the Dirac Theory 


In fact, the Dirac equation describes electrons (and neutrinos) better than the Schrödinger equation, because it accounts for relativistic effects and spin (although it is still not the end of the story). In particular, it also holds for anti-particles (positrons), and their energy spectrum is reflected at $E = 0$. There are thus infinitely many states of negative energy, with no lower bound. In particular, the free Dirac equation allows any energy above $mc^2$ and below $-mc^2$, but none in between.

Dirac suggested viewing the vacuum as a many-body state, where all states of 
negative energy are occupied and all states of positive energy empty. If this vacuum 
is excited by more than 2mc? (through photon absorption), then a particle switches 
from a state of negative energy into a state of positive energy. This creates a particle— 
hole pair, which may be interpreted as electron—positron pair creation. Conversely, 
there may also be pair annihilation, where a particle makes a transition to a hole state 
and emits electromagnetic radiation. 

Even though pair generation and annihilation may be described with the hole 
theory, the Dirac equation leaves some questions open. In particular, it cannot be 
a one-particle theory. The many particles of negative energy should interact with 
each other. In addition it remains to be clarified whether electrons or positrons have 
negative energy. These problems can only be tackled by field quantization. 


List of Symbols


We stick closely to the recommendations of the International Union of Pure and Applied Physics (IUPAP) and the Deutsches Institut für Normung (DIN). These are listed in Symbole, Einheiten und Nomenklatur in der Physik (Physik-Verlag, Weinheim 1980) and are marked here with an asterisk. However, one and the same symbol may represent different quantities in different branches of physics. Therefore, we have to divide the list of symbols into different parts (Table 5.3).


Table 5.3 Symbols used in quantum mechanics II

Symbol      Name                        Page number
H           Full Hamilton operator      404
H0          Free Hamilton operator      404
V           Interaction operator        404
G           Propagator for H            405
G0          Propagator for H0           405
S           Scattering operator         414
T           Transition operator         415
            One-particle operator       444
Ω           Solid angle                 417
Ω±          Møller's wave operators     413
P, Q        Projection operators        413
|ψ⟩±        Scattering states           412
* σ         Scattering cross-section    418
δ           Scattering phase            421
* Γ         Level width                 425
Ψ           Annihilation operator       442
Ψ†          Creation operator           442
N           Particle number operator    443
γ^μ         Dirac matrix                489
σ^{μν}      Dirac matrix                490


References 


H. Feshbach, Ann. Phys. 19, 287 (1962)
E.P. Wigner, L. Eisenbud, Phys. Rev. 72, 29 (1947)
A.M. Lane, R.G. Thomas, Rev. Mod. Phys. 30, 257 (1958)
P.I. Kapur, R. Peierls, Proc. Roy. Soc. A 166, 277 (1937)
P.A. Kazaks, K.R. Greider, Phys. Rev. C 1, 856 (1970)
E.W. Schmid, H. Ziegelmann, The Quantum Mechanical Three-Body Problem (Vieweg, Braunschweig, 1974)
G.E. Brown, Many-Body Problems (North-Holland, Amsterdam, 1972), p. 22
C. Cohen-Tannoudji, J. Dupont-Roc, G. Grynberg, Photons and Atoms (Wiley, New York, 1989), Chap. 5


9. E. Schrödinger, Naturwissenschaften 14, 664 (1926) 
10. O. Klein, Z. Phys. 37, 895 (1926) 
11. W. Gordon, Z. Phys. 40, 117 (1926) 
12. E. Schrödinger, Ann. Physics 79, 489 (1926) 
13. V. Fock, Z. Phys. 38, 242; 39, 226 (1926) 
14. P.A.M. Dirac, Proc. Roy. Soc. A 117, 610 (1928) 
15. R.A. Swainson, G.W.F. Drake: J. Phys. A 24, 79, 95 (1991) 


Suggestions for Textbooks and Further Reading 


16. W. Greiner, J. Reinhardt: Field Quantization (Springer, New York 1996) 

17. W. Greiner: Relativistic Quantum Mechanics: Wave Equations (Springer, New York 2000) 

18. V.B. Berestetskii, E.M. Lifshitz, L.P. Pitaevskii, Course of Theoretical Physics—Quantum 
Electrodynamics, vol. 4 ,2nd edn. (Butterworth-Heinemann, Oxford 1982) 


Chapter 6
Thermodynamics and Statistics


6.1 Statistics 


6.1.1 Introduction 


Although this chapter is announced in the usual way as being about thermodynamics and statistics, we shall nevertheless begin with statistics. Then we shall be able to justify thermodynamics with quantum theory, and present the entropy S in a more logical way.¹ The entropy is a key basic notion in the theory of heat and must otherwise be introduced axiomatically. In such a representation, thermodynamics starts with the following main theorems, where the notion of state variable appears three times and, as an observable, is associated with the instantaneous state of the considered system, e.g., position, momentum, and energy in particle mechanics:


Zeroth main theorem (R. H. Fowler): There is a state variable called temperature T (in kelvin K). Two systems (or two parts of a system) are only in thermal equilibrium if they have equal temperature.


First main theorem (R. Mayer, H. v. Helmholtz): There is a state variable called the internal energy U of the system. It increases by the (reversible or irreversible) addition of an amount of heat δQ and addition of work δA:

dU = δQ + δA.


¹ It is interesting to quote Carathéodory [1] in his inaugural address to the Prussian Academy as cited in [2]: "It is possible to ask the question as to how to construct the phenomenological science of thermodynamics when it is desired to include only directly measurable quantities, that is volumes, pressures, and the chemical composition of systems. The resulting theory is logically unassailable and satisfactory for the mathematician because, starting solely with observed facts, it succeeds with a minimum of hypotheses. And yet, precisely these merits impede its usefulness to the student of nature, because, on the one hand, temperature appears as a derived quantity, and on the other, and above all, it is impossible to establish a connection between the world of visible and tangible matter and the world of atoms through the smooth walls of the all too artificial structure."


© Springer Nature Switzerland AG 2018
A. Lindner and D. Strauch, A Complete Course on Theoretical Physics, Undergraduate Lecture Notes in Physics,
https://doi.org/10.1007/978-3-030-04360-5_6


Note that dU is a complete differential, while the terms on the right-hand side are 
not necessarily so. For a cycle, $ dU = 0 holds, while not all closed integrals of the 
individual quantities on the right would vanish. Therefore, U is a state variable, but 
heat and work are not, as already stressed in Fig. 2.1 (more on that in Sect. 6.4.2). 
Symbols containing § are common in variational calculus (see Sects. 2.1.2 and 2.1.3). 
For a closed system the energy conservation law holds, i.e., dU = 0. Generally, there 
are no conservation laws for heat or work alone. 


Second main theorem (R. Clausius, W. Thomson/Lord Kelvin): There is a state variable
called entropy S. This increases by the reversibly added quantity $\delta Q_{\mathrm{rev}}/T$,

$$ dS = \frac{\delta Q_{\mathrm{rev}}}{T} , $$

and for a closed system it can only increase with time:

$$ \frac{dS}{dt} \ge 0 , \quad \text{for a closed system.} $$


This inequality is called the entropy law. 


Third main theorem (W. Nernst): At the absolute zero of the temperature T = 0, the 
entropy depends only on the degree of degeneracy of the ground state. There we can 
set S = 0. 


The entropy seems to many people like a mysterious auxiliary quantity. What is
important for its measurement is the amount of heat added reversibly, and whether
or not a process in a closed system can be reversed depends only on the entropy.
The entropy law may thus break time-reversal invariance.

On the other hand, if we begin with statistics and derive the phenomena associated 
with heat from the disordered motion of particles like molecules, atoms, or photons, 
as described in [3, 4], for example, then we can begin by introducing the information 
entropy (the many different possibilities of expression). This can be used to justify 
the main theorems of thermodynamics. However, in statistics we have to rely on
“sensible” assumptions.

Therefore, we can already clarify the notion of entropy in this section. In Sect. 6.2, 
we introduce the time dependence and justify the entropy law. After that we will 
consider equilibrium distributions and use this to understand what entropy can do 
for us. In Sect. 6.4, we can then deal with the main theorems of thermodynamics and 
subsequently turn to applications. 

In the following, we consider systems with very many degrees of freedom (very 
many “particles”), whose individual characteristics neither can nor shall be pursued 
in detail. If we take a mole of some gas (i.e., nearly a septillion molecules), then we 
can neither solve the coupled equations of motion, nor set the initial conditions for all
the individual particles correctly and actually follow their time evolution. In fact, we
do not want to observe the single molecules, but only a few properties (parameters) 
of the system as a whole. We can fix the macro state through a handful of collective 
(macroscopic) parameters and follow its evolution, but not the basic micro state, 
which contains far too many microscopic parameters. (Even if only a few particles 
appear to be important, their coupling to the environment with its many degrees 
of freedom cannot be switched off completely, and this environment is continually 
changing.) 

A truly enormous number of different micro states belong to any given macro 
state, specified by its particle number and type, its energy and volume, etc. We shall 
treat these many states using statistical methods. All the micro states belonging to 
the same macro state form a statistical ensemble. 


6.1.2 Statistical Ensembles and the Notion of Probability 


A statistical ensemble is described by a small number of parameters, while many
other parameters vary from member to member within the ensemble. As an example, 
we have already mentioned a gas of molecules, whose energy, volume, and particle 
number have been given. Another statistical ensemble need not even assume a fixed 
number of particles. So the local particle densities in the considered gases may differ 
significantly from the mean value N/V. The different values occur with different 
sub-ensembles of particles in the ensembles. 

From the occurrence of an attribute (signature) in a sequence (of micro states), we
may infer its probability, i.e., its relative occurrence in the limit of large sequences.
If we consider, e.g., the results of tossing a die, then the “6” will not always occur
exactly once for every six throws (sometimes not at all, sometimes repeatedly), but for
a fair die every number $z \in \{1, \ldots, 6\}$ will occur on the average equally often. The
probabilities $\rho_z$ for a fair die do not depend on z. Summed over all possibilities, we
must obtain unity, i.e., $\sum_z \rho_z = 1$, because the probability that the event
$z_1$ or $z_2$ occurs is generally equal to $\rho_1 + \rho_2$ (to be contrasted with the probability
that first $z_1$, then $z_2$ appears, which is equal to $\rho_1 \cdot \rho_2$, and that once $z_1$ and once $z_2$
appears, equal to $\rho_1 \cdot \rho_2 + \rho_2 \cdot \rho_1 = 2\rho_1 \cdot \rho_2$). With $\rho_1 = \cdots = \rho_6$ and $\sum_z \rho_z = 1$,
we conclude that the probability $\rho_z$ for each number of spots is equal to 1/6, for a
fair die.

If z is generally a natural number which may assume Z values, and $\rho_z$ the associated
probability (relative occurrence in the statistical ensemble), then $\rho_z$ is real,
non-negative, and normalized:

$$ \rho_z = \rho_z^* \ge 0 , \qquad \sum_z \rho_z = 1 . $$

If z is a continuous variable, then $\rho(z)$ will be a probability density, and instead of
the sum, there will be an integral. Only $\rho(z)\,dz$ is then a probability, namely that the
variable takes a value between z and z + dz.




The mean value of a quantity A in an ensemble given by $\{\rho_z\}$ is clearly

$$ \langle A \rangle = \sum_z \rho_z A_z , $$

since each value $A_z$ is weighted here with its associated probability. In quantum
mechanics (see Sect. 4.2.11), $\rho$ and A are Hermitian operators, which we may represent
in a basis $\{|n\rangle\}$ as matrices. Then,

$$ \langle A \rangle = \sum_{nn'} \langle n|\rho|n'\rangle \langle n'|A|n\rangle = \sum_n \langle n|\rho A|n\rangle = \mathrm{tr}(\rho A) . $$


In the eigen representation of $\rho$ or A, only the diagonal elements of $\rho$ and A are
necessary, and thus only a sum over $\rho_n A_n$. Therefore, in the following we shall
often write $\langle A \rangle = \mathrm{tr}(\rho A)$, even though we think mostly of $\rho$ and A as the classical
quantities.

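The mean value is a linear functional, i.e., for arbitrary constants $\alpha$ and $\beta$, we have

$$ \langle \alpha A + \beta B \rangle = \alpha \langle A \rangle + \beta \langle B \rangle , $$

since $\mathrm{tr}\{\rho\,(\alpha A + \beta B)\} = \alpha\,\mathrm{tr}(\rho A) + \beta\,\mathrm{tr}(\rho B)$. With $\langle 1 \rangle = 1$, the mean value of the
deviations from the mean value vanishes:

$$ \langle A - \langle A \rangle \rangle = 0 , $$

but generally the square of the fluctuation (the variance or dispersion) will not be
zero:

$$ (\Delta A)^2 = \langle (A - \langle A \rangle)^2 \rangle = \langle A^2 \rangle - \langle A \rangle^2 , $$

where $\Delta A \ge 0$. We call $\Delta A$ the standard deviation (error width or root-mean-square
deviation, and in quantum theory, the uncertainty) and $\Delta A/\langle A \rangle$ the relative fluctuation:
the smaller it is, the less frequently the members of the ensemble are in states
with a value $A_z$ which deviates significantly from $\langle A \rangle$. Results of measurements will
be given in the form $\langle A \rangle \pm \Delta A$.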
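As a small illustration of these definitions, the following sketch evaluates the mean value and the standard deviation for the fair die discussed above. The function names `mean` and `std_dev` are our own choices for this example, not notation from the text.

```python
from fractions import Fraction
from math import sqrt

def mean(rho, values):
    """Ensemble average <A> = sum_z rho_z * A_z."""
    return sum(r * a for r, a in zip(rho, values))

def std_dev(rho, values):
    """Standard deviation from (Delta A)^2 = <A^2> - <A>^2."""
    m = mean(rho, values)
    m2 = mean(rho, [a * a for a in values])
    return sqrt(m2 - m * m)

# Fair die: rho_z = 1/6 for z = 1, ..., 6 (illustrative example)
faces = [1, 2, 3, 4, 5, 6]
rho = [Fraction(1, 6)] * 6

assert sum(rho) == 1                 # normalization
avg = mean(rho, faces)               # exact rational mean value
dev = std_dev(rho, faces)            # floating-point standard deviation
print(f"<z> = {float(avg):.3f}, Delta z = {dev:.3f}, Delta z/<z> = {dev / float(avg):.3f}")
```

Exact fractions are used for the probabilities so that the normalization can be checked without rounding errors.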


6.1.3 Binomial Distribution 


For the probability distribution $\{\rho_z\}$ of a statistical ensemble of Z mutually independent
experiments, we are often led to ask whether z of them have turned out to be convenient
(positive), with the remaining ones being inconvenient (negative), i.e., there
are essentially two outcomes. This is similar to the problem of a one-dimensional
random walk with fixed step length. Here, a body can move forward or backward,
with probability p for each step ahead and probability q = 1 − p for
each step back. We then ask where it will be after Z steps. If there are z steps ahead
and Z − z steps back, then finally it will have made z − (Z − z) = 2z − Z steps
ahead; for $z < \tfrac{1}{2} Z$, it will thus have moved back.

The probability that out of the Z steps the first z were ahead and the remaining
ones back is clearly equal to $p^z q^{Z-z}$. Here the order is not important at all, since we
ask only for the probability that in total there were z steps ahead. As is well known,
there is a total of Z! different ways to order Z distinguishable objects: the last can
be placed at Z sites and therefore increases the number of ways by a factor of Z,
while for Z = 1, there is only one site. But here only two outcomes are distinguished
(ahead or back), and therefore Z! is to be divided by z! (Z − z)!. Thus $\binom{Z}{z}$ different
combinations deliver the same result. The probability sought is therefore equal
to the probability $p^z q^{Z-z}$ for the first-mentioned possibility times this number of
equivalent series. We find the binomial distribution (Bernoulli distribution)

$$ \rho_z = \binom{Z}{z}\, p^z q^{Z-z} . $$

With $\sum_z \binom{Z}{z} p^z q^{Z-z} = (p + q)^Z$ and p + q = 1, we do indeed obtain $\sum_z \rho_z = 1$.

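The mean value of “convenient” occurrences is $\langle z \rangle = \sum_z \rho_z z$. Such mean values
can often be evaluated as derivatives with respect to suitable parameters. For the
binomial distribution, for instance,

$$ \langle z \rangle = \sum_z z \binom{Z}{z} p^z q^{Z-z} = p\,\frac{\partial}{\partial p}\,(p+q)^Z \Big|_{p=1-q} = pZ , $$

as expected, because the probability p is the ratio $\langle z \rangle/Z$. We can also find the mean
value of $z^2$ for this distribution in a similar way, by noting that $\langle z^2 \rangle$ is equal to

$$ \Big( p\,\frac{\partial}{\partial p} \Big)^2 (p+q)^Z \Big|_{p=1-q} = p\,\frac{\partial}{\partial p} \big\{ pZ\,(p+q)^{Z-1} \big\} \Big|_{p=1-q} , $$

which is equal to

$$ pZ + p^2 Z(Z-1) = p^2 Z^2 + (p - p^2)\,Z . $$

Hence the binomial distribution yields the standard deviation and relative fluctuation
(see Fig. 6.1)

$$ \Delta z = \sqrt{pqZ} \quad \text{and} \quad \frac{\Delta z}{\langle z \rangle} = \sqrt{\frac{q}{p}}\,\frac{1}{\sqrt{Z}} . $$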
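These results can be checked numerically by summing over the distribution directly. The following is a minimal sketch; the values Z = 20 and p = 0.15 are chosen only for illustration (they give a mean value of 3, one of the cases of Fig. 6.1), and the helper names are our own.

```python
from math import comb, sqrt, isclose

def binomial(Z, p):
    """Return [rho_0, ..., rho_Z] with rho_z = C(Z, z) p^z q^(Z-z)."""
    q = 1.0 - p
    return [comb(Z, z) * p**z * q**(Z - z) for z in range(Z + 1)]

def moments(rho):
    """Mean <z> and standard deviation Delta z for z = 0, 1, 2, ..."""
    mean = sum(z * r for z, r in enumerate(rho))
    mean_sq = sum(z * z * r for z, r in enumerate(rho))
    return mean, sqrt(mean_sq - mean**2)

Z, p = 20, 0.15                      # illustrative values
rho = binomial(Z, p)
mean, dev = moments(rho)

# Compare the direct sums with the closed-form results of the text
print("normalization:", sum(rho))
print("<z>     :", mean, "expected", p * Z)
print("Delta z :", dev, "expected", sqrt(p * (1 - p) * Z))
```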


[Figure 6.1: bar charts of $\rho_z$ versus z for Z = 20; markers: ● Poisson distribution, × Gauss distribution]

Fig. 6.1 Binomial distributions (Bernoulli distributions), represented by bars, with 20 possibilities
and the mean values $\langle z \rangle \in \{1, 3, 10\}$. For comparison, we also show the values for the associated
Poisson and Gauss distributions


With increasing Z, we find that $\Delta z/\langle z \rangle$ becomes ever smaller, the maximum of $\{\rho_z\}$
becoming sharper. For example, with $Z = 10^{20}$ and $p \approx q$, the measurement value
is uncertain only in the tenth digit.


6.1.4 Gauss and Poisson Distributions 


For very large Z the binomial coefficients are difficult to evaluate. Then it is better
to use approximation formulas for the factorials of Z and Z − z, in particular
Stirling’s formula

$$ Z! \approx (Z/e)^Z \sqrt{2\pi Z} . $$

This can be proven using $n! = \int_0^\infty x^n e^{-x}\,dx$, if the exponent $n \ln x - x$ is
expanded in a power series about the maximum x = n (Problem 6.2). For very large
Z, we may even leave out the square-root factor, because $\ln \sqrt{2\pi Z} \ll Z\,(\ln Z - 1) = \ln (Z/e)^Z$,
as also represented in Fig. 6.2. The logarithmic scale is very appropriate here.
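The quality of both approximations is easy to examine numerically. The sketch below evaluates the two ratios via logarithms to avoid overflow, using `math.lgamma(Z + 1)` for ln Z!; the function names are our own.

```python
from math import exp, lgamma, log, pi

def stirling_ratio(Z):
    """Ratio (Z/e)^Z sqrt(2 pi Z) / Z!, computed from logarithms."""
    log_stirling = Z * (log(Z) - 1) + 0.5 * log(2 * pi * Z)
    return exp(log_stirling - lgamma(Z + 1))     # lgamma(Z + 1) = ln Z!

def log_ratio(Z):
    """Ratio ln sqrt(2 pi Z) / ln (Z/e)^Z, i.e. the weight of the square-root factor."""
    return 0.5 * log(2 * pi * Z) / (Z * (log(Z) - 1))

for Z in (10, 100, 1000):
    print(Z, stirling_ratio(Z), log_ratio(Z))
```

The first ratio approaches 1 from below, and the second one tends to zero, which is why the square-root factor may be dropped for very large Z.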

Let us start by considering the case $p \ll 1$ (or equivalently $q \ll 1$, because then
we need to interchange only p and q). We have $\langle z \rangle \ll Z$, implying that only $z \ll Z$
is important. Therefore, we may now approximate the binomial coefficients $\binom{Z}{z}$ with
$(1 - z/Z)^z \approx 1$, but $(1 - z/Z)^Z \approx e^{-z}$ as follows:

$$ \binom{Z}{z} \approx \frac{1}{z!}\,\frac{(Z/e)^Z}{\{(Z-z)/e\}^{Z-z}} = \frac{1}{z!} \Big( \frac{Z}{e} \Big)^z \frac{(1 - z/Z)^z}{(1 - z/Z)^Z} \approx \frac{Z^z}{z!} . $$


[Figure 6.2: two panels versus Z on logarithmic scales; left: $(Z/e)^Z \sqrt{2\pi Z}/Z!$, right: $\ln \sqrt{2\pi Z} : \ln (Z/e)^Z$]

Fig. 6.2 Quality of the Stirling formula. The ratio $(Z/e)^Z \sqrt{2\pi Z}/Z!$ (left) and the ratio of the
logarithms of $\sqrt{2\pi Z}$ and $(Z/e)^Z$ (right) versus Z


In addition, with $\ln(1 - p) \approx -p$, we may set $q^{Z-z} \approx e^{-p(Z-z)}$. For $z \ll Z$, the
factor $e^{pz}$ can be neglected in comparison to $e^{-pZ}$. Consequently, for $Z \gg 1$ and $p \ll 1$,
with $\langle z \rangle = pZ$, the binomial distribution goes over into the Poisson distribution

$$ \rho_z = \exp(-\langle z \rangle)\,\frac{\langle z \rangle^z}{z!} . $$

Since $\sum_{z=0}^{Z} \langle z \rangle^z / z!$ tends to $e^{\langle z \rangle}$ for $Z \gg \langle z \rangle$, the normalization is conserved, despite
the approximations. In addition, for $q \approx 1$, from the standard deviation of the binomial
distribution, we now obtain $(\Delta z)^2 = pZ = \langle z \rangle$, and likewise from the Poisson
distribution.
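The limit can be made concrete by comparing both distributions term by term. The sketch below uses the illustrative values Z = 1000 and p = 0.003, which are our own choice; the function names are likewise only for this example.

```python
from math import comb, exp, factorial

def binomial_pmf(z, Z, p):
    """rho_z = C(Z, z) p^z q^(Z-z) with q = 1 - p."""
    return comb(Z, z) * p**z * (1.0 - p)**(Z - z)

def poisson_pmf(z, mean):
    """rho_z = exp(-<z>) <z>^z / z!."""
    return exp(-mean) * mean**z / factorial(z)

Z, p = 1000, 0.003                   # illustrative: Z >> 1 and p << 1
mean = p * Z

# Largest pointwise deviation over the range where rho_z is not negligible
max_diff = max(abs(binomial_pmf(z, Z, p) - poisson_pmf(z, mean)) for z in range(21))

# Variance of the Poisson distribution, summed far enough into the tail
var = sum((z - mean)**2 * poisson_pmf(z, mean) for z in range(60))

print("largest deviation:", max_diff)
print("Poisson variance :", var, "mean:", mean)
```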

The Poisson distribution always occurs if there are a great many possibilities, but 
only a few are actually realized, e.g., for the probability of weakly coupled quanta 
striking the atoms of a multi-layered lattice, or for the clump probability in a beam 
of mutually independent particles, where we may ask how soon one quantum is 
followed by the next and we refer to the average distance. (The sequence is no longer 
independent, if the quanta occur preferably in pairs or single.) 

So far, with $p \ll 1$, only $z \ll Z$ was important, or equivalently, for $q \ll 1$, only
$z \approx Z \gg 1$ was important. If neither p nor q is very small, then these boundary
values are no longer relevant. We may then take z as a continuous variable and expand
$\ln \rho(z)$ in a Taylor series about the maximum $\langle z \rangle$. If we use the Stirling formula for
the factorials in $\rho_z$, then we obtain (Problem 6.4) the Gauss distribution, also called
the normal distribution (see Figs. 1.15 and 6.1):

$$ \rho(z) = \frac{1}{\sqrt{2\pi}\,\Delta z}\,\exp\Big( -\frac{(z - \langle z \rangle)^2}{2\,(\Delta z)^2} \Big) . $$


Here we always have $\langle z \rangle = pZ$ and $\Delta z = \sqrt{pqZ}$. Instead of the error width $\Delta z$, the
half-width is sometimes taken, i.e., the interval in which $\rho(z)$ is greater
than or equal to half the maximum value (the full width at half maximum). For the
Gauss distribution, it is $2\sqrt{\ln 4}\,\Delta z \approx 2.35\,\Delta z$.

Table 6.1 Correlation between people’s size and weight in terms of thin (○) and thick (●)
[Table body: rows by size (short, ...), columns by weight (light, average, heavy)]
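Both statements can be verified with a few lines of code. The following sketch checks that the Gauss distribution falls to half its maximum at a distance of half the stated width from the mean value, and compares it with the binomial distribution at the maximum for the illustrative case Z = 20, p = 1/2 (the symmetric case of Fig. 6.1). The function name `gauss` is our own.

```python
from math import comb, exp, log, pi, sqrt

def gauss(z, mean, dev):
    """Gauss distribution with mean value <z> = mean and standard deviation dev."""
    return exp(-(z - mean)**2 / (2 * dev**2)) / (sqrt(2 * pi) * dev)

Z, p = 20, 0.5                       # illustrative values
mean, dev = p * Z, sqrt(p * (1 - p) * Z)

# Full width at half maximum according to the text: 2 sqrt(ln 4) * Delta z
fwhm = 2 * sqrt(log(4)) * dev
half_value = gauss(mean + fwhm / 2, mean, dev)

# Binomial probability at the maximum z = 10 for comparison
binom_at_mean = comb(Z, 10) * p**10 * (1 - p)**10

print("FWHM / Delta z        :", fwhm / dev)
print("rho at half width/max :", half_value / gauss(mean, mean, dev))
print("Gauss vs binomial     :", gauss(10, mean, dev), binom_at_mean)
```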

What is important in all these examples of probability distributions is the result
that the relative deviation from the mean value becomes ever smaller with increasing Z,
and in the limit $Z \gg 1$, the uncertainty $\Delta z$ becomes negligible, because we
can only give the mean value $\langle z \rangle$ to a few significant figures.


6.1.5 Correlations and Partial Systems 


We usually consider several observables and investigate how they are connected to 
each other. We restrict ourselves here to two quantities A and B. Their deviations 
from the average value may be correlated to each other, e.g., people’s height and 
weight (see Table 6.1). 

A measure for such correlations is clearly

$$ K_{AB} = \langle (A - \langle A \rangle)(B - \langle B \rangle) \rangle = \langle AB \rangle - \langle A \rangle \langle B \rangle , $$

which can be usefully related to the fluctuations $\Delta A$ and $\Delta B$. A better measure is
the normalized correlation or correlation coefficient

$$ \kappa_{AB} = \frac{K_{AB}}{\Delta A \cdot \Delta B} = \frac{\langle AB \rangle - \langle A \rangle \langle B \rangle}{\Delta A \cdot \Delta B} . $$

The fluctuation (squared) $(\Delta A)^2$ is thus equal to the auto-correlation $K_{AA}$, and $\kappa_{AA}$
is 1. While $K_{AB}$ may be negative, in which case we speak of an anti-correlation, this
is not possible for the auto-correlation. Note that, in quantum mechanics, we have
$\langle AB \rangle \ne \langle BA \rangle$ if the operators A and B do not commute. Then, according to p. 326,
for the correlation, we often use the symmetrized product $\tfrac{1}{2}(AB + BA)$ and take
$K_{AB} = \tfrac{1}{2} \langle AB + BA \rangle - \langle A \rangle \langle B \rangle$ as the correlation.
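For classical quantities these definitions can be tried out on a small joint distribution. In the sketch below, the joint probabilities and the names `expectation` and `correlation` are hypothetical choices for illustration; the symbol kappa for the normalized correlation follows the reconstruction used above.

```python
from math import sqrt

def expectation(joint, f):
    """Mean value <f(A, B)> for a joint distribution {(a, b): probability}."""
    return sum(prob * f(a, b) for (a, b), prob in joint.items())

def correlation(joint):
    """Return (K_AB, kappa_AB) as defined in the text."""
    mean_a = expectation(joint, lambda a, b: a)
    mean_b = expectation(joint, lambda a, b: b)
    k_ab = expectation(joint, lambda a, b: a * b) - mean_a * mean_b
    dev_a = sqrt(expectation(joint, lambda a, b: a * a) - mean_a**2)
    dev_b = sqrt(expectation(joint, lambda a, b: b * b) - mean_b**2)
    return k_ab, k_ab / (dev_a * dev_b)

# Hypothetical joint probabilities for two two-valued quantities
correlated = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
anti = {(0, 1): 0.5, (1, 0): 0.5}

for name, joint in (("correlated", correlated), ("independent", independent), ("anti", anti)):
    print(name, correlation(joint))
```

The factorizing (independent) distribution gives a vanishing correlation, while the strictly anti-correlated one reaches the lower bound of the correlation coefficient.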

If several independent variables $z^{(1)}, \ldots, z^{(N)}$ occur, we combine them into a
vector $\mathbf{z}$ and consider $\rho(\mathbf{z})$. We shall soon see that $\rho(\mathbf{z})$ may be written exactly as
a product $\rho^{(1)}(z^{(1)}) \cdots \rho^{(N)}(z^{(N)})$ if there are no correlations between observables
which are only related to different variables.

In particular, if we take a property $A^{(i)}$, for which only the ith variable $z^{(i)}$
is important, then we may immediately sum over all other variables $z^{(k \ne i)}$ in
$\langle A^{(i)} \rangle = \sum_{\mathbf{z}} \rho(\mathbf{z})\,A^{(i)}(\mathbf{z})$, because $A^{(i)}(\mathbf{z})$ does not depend on them. With this sum,
$\rho(\mathbf{z})$ becomes a function of $z^{(i)}$ alone, and in fact $\rho^{(i)}(z^{(i)})$, if $\rho(\mathbf{z})$ factorizes and use
is made of $\sum_{z^{(k)}} \rho^{(k)}(z^{(k)}) = 1$.

For $i \ne k$, we then have

$$ \langle A^{(i)} A^{(k)} \rangle = \sum_{z^{(i)} z^{(k)}} \rho^{(i,k)}(z^{(i)}, z^{(k)})\,A^{(i)}(z^{(i)})\,A^{(k)}(z^{(k)}) . $$

If $\rho^{(i,k)}(z^{(i)}, z^{(k)})$ factorizes, this is equal to $\langle A^{(i)} \rangle \langle A^{(k)} \rangle$, so the mean value of the
products is the product of the mean values. Conversely, if all correlations vanish,
then the probability factorizes.

If a system can be decomposed into mutually independent parts, then there are no 
correlations between them, and its probability can be broken down into the products 
of the individual probabilities, one for each part. 


6.1.6 Information Entropy 


To each probability distribution $\{\rho_z\}$, we assign an “information measure” $I \ge 0$. It
vanishes if the same thing always happens, i.e., if only a “boring” case z′ is always
realized, thus if $\rho_z = \delta_{zz'}$. The more there are other possibilities that can be realized,
the more information can be transmitted, the sooner there will be a rare message,
and the greater will be the uncertainty concerning the present event. As information
measure, we take the number of yes-no decisions with which, for a given distribution
$\{\rho_z\}$, on the average, one of the possibilities can be determined. This information
measure is

$$ I = -\sum_z \rho_z\,\mathrm{lb}\,\rho_z , $$

where lb denotes the binary logarithm, i.e., to the base 2, defined by

$$ 2^{\mathrm{lb}\,x} = x , \quad \text{whence} \quad \mathrm{lb}\,x = \log_2 x = \frac{\ln x}{\ln 2} . $$

Occasionally, ld x is used instead of lb x, referred to as the logarithmus dualis. The
unit of information is the bit (binary digit). For example, a set of $32 = 2^5$ playing
cards contains 5 bit of information, as we shall see soon.

However, the information measure I only evaluates how rarely an event occurs,
but does not account for its worthiness, in the sense of how much it is worth to us.
The playing cards have different values for the different players, but each
contributes an information content of 5 bit, and a row with 100 arbitrarily chosen
letters (and punctuation marks) has the same information content as an equally long
piece of prose or verse. Since there may be overwhelmingly many “misprints”, I is
often called a measure of disorder. (It is interesting to note that, in written texts, the
letters do not all occur with equal probability. In German, the information content
of one of the 26 letters of the alphabet, together with the space, is not $\mathrm{lb}\,27 \approx 4.75$
bit, but only approximately 4 bit.)

In order to understand that the given sum achieves what is required of it, we
proceed stepwise. First, we restrict ourselves to Z equally probable possibilities,
whence $\rho_z = 1/Z$. For $Z = 2^m$ possibilities, clever questioners drop half the
remaining possibilities after each response: after m responses, they know which of the $2^m$
possibilities is actually realized. Here we thus have $I = \mathrm{lb}\,Z$. If Z is not a power of 2,
we do not always need the same number of questions. For Z = 3, in a third of all
cases, one question suffices. Then with the second question, we could already check
the next attribute. Here the information measure for the questions for two attributes
has to be additive: if the first attribute has $Z_1$ equally probable possibilities and the
second $Z_2$ likewise, then in total there are $Z = Z_1 Z_2$ equally probable possibilities,
and we must have $I(Z_1 Z_2) = I(Z_1) + I(Z_2)$. This requirement is fulfilled only by
the function $I(Z) = c \ln Z$, where the factor c cannot depend on Z, and clearly has
to be equal to $1/\ln 2$, so that everything is correct for $Z = 2^m$. For $\rho_z = 1/Z$, we do
indeed find the above-mentioned expression, because $-Z\,(1/Z)\,\mathrm{lb}(1/Z) = \mathrm{lb}\,Z$.

The additivity of the information measure for independent attributes must also
be valid for other distributions $\{\rho_z \ne 1/Z\}$. For these, we take the greatest common
divisor 1/Z′ of all fractions $\rho_z$ and start from a total of Z′ equally probable events
which we combine into Z groups, each with $N_z = \rho_z Z'$ members (see Figs. 6.3 and
6.4). The information measure $\mathrm{lb}\,Z'$ may then be composed of two terms: one is the
unknown auxiliary quantity $I_z$, which measures the information related to the
characterization of the group z, while the other rates the information in this group
and clearly has the value $\mathrm{lb}\,N_z$. Then $\mathrm{lb}\,Z' = I_z + \mathrm{lb}\,N_z$, and with $N_z/Z' = \rho_z$, this
delivers the expression $I_z = -\mathrm{lb}\,\rho_z$, while its mean value gives the quantity sought.
Thus on the average $I = -\sum_z \rho_z\,\mathrm{lb}\,\rho_z$ questions are indeed necessary before reaching
the final decision.
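The argument can be followed numerically. The sketch below evaluates the information measure for the 32 playing cards and for the distribution of Fig. 6.3 as reconstructed here (probabilities 1/3, 1/2, 1/6, i.e., groups of 2, 3, and 1 out of Z′ = 6 cases), and checks the decomposition into group information and information within the group. The function name `information` is our own.

```python
from math import log2

def information(rho):
    """I = -sum_z rho_z lb rho_z in bit; terms with rho_z = 0 contribute nothing."""
    return -sum(r * log2(r) for r in rho if r > 0)

# 32 equally probable playing cards
cards = [1 / 32] * 32

# Distribution of Fig. 6.3 mapped onto Z' = 6 equally probable cases
z_prime = 6
group_sizes = [2, 3, 1]                        # N_z = rho_z * Z'
rho = [n / z_prime for n in group_sizes]

# Group information I_z = lb Z' - lb N_z and its mean value
i_z = [log2(z_prime) - log2(n) for n in group_sizes]
mean_i_z = sum(r * i for r, i in zip(rho, i_z))

print("cards   :", information(cards), "bit")
print("Fig. 6.3:", information(rho), "bit; mean of I_z:", mean_i_z)
```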

As $\rho_z \to 0$, the quantity $-\mathrm{lb}\,\rho_z$ increases beyond all bounds, but nevertheless so
slowly that $\rho_z\,\mathrm{lb}\,\rho_z \to 0$. We do not need to question completely improbable
possibilities as they do not contribute to the uncertainty; physicists often speak of frozen
degrees of freedom.


Fig. 6.3 Information measure for $\{\rho_1 = \tfrac{1}{3},\ \rho_2 = \tfrac{1}{2},\ \rho_3 = \tfrac{1}{6}\}$. This is indicated here by the upper
probability distribution. The problem can be mapped onto Z′ = 6 equally probable cases, whence
the steps turn into a single bar of equal area. With the additivity of the information measure, it then
follows that $\mathrm{lb}\,6 = I_1 + \mathrm{lb}\,2 = I_2 + \mathrm{lb}\,3 = I_3 + \mathrm{lb}\,1$ and hence $I_z = -\mathrm{lb}\,\rho_z$


[Figure 6.4: information entropy I plotted against $\rho_1$, both axes running from 0 to 1]

Fig. 6.4 Information entropy I for the binary system. It has only two states, and therefore
$\rho_1 + \rho_2 = 1$. Hence, I may be represented here as a function of $\rho_1$. Note the
steep slope for $\rho_1 \approx 0$ and $\rho_2 \approx 0$. The uncertainty is greatest when the two states
are occupied with equal probability ($\rho_1$ and $\rho_2$ both equal to 1/2)


In thermodynamics, instead of the information measure I, we use the information
entropy

$$ S = (k \ln 2)\,I = -k \sum_z \rho_z \ln \rho_z , $$

where k is the Boltzmann constant, already mentioned in the list of fundamental
constants on p. 623. Note that we prefer the natural logarithm ln x, because it can be
differentiated with respect to x more easily than lb x. With $0 \le \rho_z \le 1$, the entropy
is never negative. It vanishes if only one state is occupied, and takes its largest value
if all possible states are equally probable (Problem 6.6).
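For the binary system of Fig. 6.4, these properties can be seen by a simple scan over the occupation probability of the first state. The following sketch works with S/k, so that no unit conversion is needed; the grid spacing of 0.01 and the function name are arbitrary choices for illustration.

```python
from math import log

def entropy_over_k(rho):
    """S/k = -sum_z rho_z ln rho_z; unoccupied states contribute nothing."""
    return -sum(r * log(r) for r in rho if r > 0)

# Scan the binary system: rho_1 on a grid, rho_2 = 1 - rho_1
grid = [i / 100 for i in range(101)]
values = [entropy_over_k([r1, 1 - r1]) for r1 in grid]

best = grid[values.index(max(values))]
print("largest S/k =", max(values), "at rho_1 =", best, "; ln 2 =", log(2))
print("S/k for a single occupied state:", values[0], values[-1])
```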


6.1.7 Classical Statistics and Phase Space Cells 


The notion of entropy just introduced is useful only for countable attributes z. This
is because $\rho_z$ has to be dimensionless, given that we cannot take a logarithm of a
probability density. This means that continuous variables have to be discretized. We
shall investigate this more precisely for the probability density $\rho(x, p)$.

According to Hamiltonian mechanics, a system of N point masses is completely
determined if their positions and momenta are given. This therefore means specifying
6N quantities. Classical N-particle systems will be represented by a point (x, p) in
the 6N-dimensional phase space. (This is also called the larger phase space or
Γ-space, the generalization of the 6-dimensional phase space of a single particle, which
is also called μ-space. In μ-space, N points are occupied.) The vectors x and p each
have 3N components.

We are concerned here with statistical ensembles and therefore assign a probability
density $\rho(x, p)$ with the following properties to each phase space point:

$$ \rho(x, p) = \rho^*(x, p) \ge 0 , \qquad \int \rho(x, p)\,d^{3N}x\,d^{3N}p = 1 , $$

i.e., $\rho(x, p)$ is real, non-negative, and normalized.




Using this, the mean values of the quantities A(x, p) may be evaluated from

$$ \langle A \rangle = \int \rho(x, p)\,A(x, p)\,d^{3N}x\,d^{3N}p . $$

Here arbitrary canonical transformations $(x, p) \leftrightarrow (x', p')$ are allowed, i.e., those
ensuring $d^{3N}x\,d^{3N}p = d^{3N}x'\,d^{3N}p'$, because we require $\rho(x, p) = \rho'(x', p')$ and
$A(x, p) = A'(x', p')$, according to Sect. 2.4.4.

In quantum theory, this is true if we take the Wigner function as the density (see
p. 324). However, it is sometimes negative. This disadvantage can be avoided with
the density operator (see Sect. 4.2.11), hence with $\langle A \rangle = \mathrm{tr}(\rho A)$. (In the position
representation, this is equal to $\int \langle x|\rho|x'\rangle \langle x'|A|x\rangle\,d^{3N}x\,d^{3N}x'$, and in the momentum
representation, to $\int \langle p|\rho|p'\rangle \langle p'|A|p\rangle\,d^{3N}p\,d^{3N}p'$. In contrast, the Wigner function
uses x and p, even though they cannot be sharp simultaneously.) The density operator
is Hermitian, non-negative, and normalized. Here, unitary transformations U
are also permitted, so instead of the position representation, the momentum or any
other representation may be used. If we have $\rho' = U \rho\,U^{-1}$ and $A' = U A\,U^{-1}$, then
$\mathrm{tr}(AB) = \mathrm{tr}(BA)$ implies $\mathrm{tr}(\rho' A') = \mathrm{tr}(\rho A)$.

We shall now divide each continuous variable x, p into equal sections $\delta x$ and $\delta p$,
so that the phase space is divided into cells of size $(\delta x\,\delta p)^{3N}$. The smaller these cells,
the more precisely the states are determined. Here, in the classical description, the
cell size may be arbitrarily small, while in quantum physics, according to Heisenberg’s
uncertainty relation, position and momentum cannot both be arbitrarily sharp,
because $\Delta x \cdot \Delta p \ge \tfrac{1}{2}\hbar$. In fact, only for

$$ \delta x \cdot \delta p = h = 2\pi\hbar $$

do classical and quantum mechanics yield the same number of states. We shall now
show this for free particles in a cube. Another example is given in Fig. 6.5 (or
Problem 6.7), namely for harmonic oscillators.


Fig. 6.5 Phase space cells partition the action variable (with the phase integrals $J = \oint p\,dx$, as
discussed on p. 136). They lead us to the action quantum. In addition to linear cell boundaries (as in 
Fig. 2.28), curved ones are also possible. Thus polar coordinates are appropriate for an oscillation. 
If the phase angle is completely unsharp (as on the time average), then for suitable scale factors, 
the phase space cells are concentric circular rings of equal area 




In a cube of side L, according to quantum theory (p. 355), the Cartesian components
of the wave vector have eigenvalues $k_x = n_x \pi/L$ with $n_x \in \{1, 2, \ldots\}$. Only the
wave functions $\sqrt{2/L}\,\sin(k_x x)$ with these $k_x$ vanish at the container walls x = 0 and x = L. The
number of one-particle states with momenta $p \le p_F = \hbar k_F = \hbar\,n_F \pi/L = \tfrac{1}{2} h\,n_F/L$
is thus equal to the number of unit cubes in the octant with radius $n_F = 2 h^{-1} L\,p_F$:

$$ \Omega = \frac{1}{8}\,\frac{4\pi}{3}\,n_F^3 = \frac{4\pi}{3}\,\frac{V}{h^3}\,p_F^3 . $$

If we divide the phase space volume $\tfrac{4\pi}{3}\,p_F^3\,V$ into cells of size $h^3$, then we have
as many cells as states according to quantum theory, and we shall exploit this in the
following.
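The counting argument holds for large quantum numbers, which can be checked by brute force. The sketch below counts the triples of positive integers inside the octant and compares the result with the octant volume. The radii 10, 20, and 40 are arbitrary illustrative choices, and the function names are our own.

```python
from math import pi

def count_states(n_F):
    """Count triples (nx, ny, nz) of positive integers with nx^2 + ny^2 + nz^2 <= n_F^2."""
    limit = n_F * n_F
    count = 0
    for nx in range(1, n_F + 1):
        for ny in range(1, n_F + 1):
            for nz in range(1, n_F + 1):
                if nx * nx + ny * ny + nz * nz <= limit:
                    count += 1
    return count

def octant_volume(n_F):
    """Volume (1/8)(4 pi/3) n_F^3 of the octant with radius n_F."""
    return (1 / 8) * (4 * pi / 3) * n_F**3

# The ratio approaches 1 as n_F grows; the deficit is a surface effect
ratios = {n: count_states(n) / octant_volume(n) for n in (10, 20, 40)}
for n, ratio in ratios.items():
    print(n, ratio)
```

For small radii the count falls noticeably below the volume, because states with a vanishing quantum number are excluded; this surface contribution becomes negligible for macroscopic systems.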

We recognize here the meaning of the Planck constant h for thermodynamics.
While the classical description leaves the discretization completely undetermined,
the quantum-mechanical uncertainty relation supplies a unique cell size in phase
space. In this respect, the name “uncertainty relation” can be considered less
appropriate.


6.1.8 Summary: Statistics 


In statistics, we consider ensembles in which the Z possibilities occur with probabilities
$\rho_z$. The probability distribution $\{\rho_z\}$ satisfies the constraints $\rho_z = \rho_z^* \ge 0$
and $\sum_z \rho_z = 1$. (For continuous z, integrals occur instead of sums. Nevertheless,
according to quantum theory, the phase space cells have size $\delta x\,\delta p = h$ and we
may discretize.) The observable A in the statistical ensemble has the average value
$\langle A \rangle = \mathrm{tr}(\rho A)$ and the uncertainty (error width) $\Delta A = \sqrt{\langle A^2 \rangle - \langle A \rangle^2}$. Two quantities
A and B have the correlation $K_{AB} = \langle AB \rangle - \langle A \rangle \langle B \rangle$. With such correlations,
we can determine whether mutually independent variables occur in the statistical
ensemble. If this is the case, then the probability distribution may be factorized into
a product whose factors each depend only on one of the variables. Important for the
following is also the information entropy $S = -k\,\mathrm{tr}(\rho \ln \rho)$. Disregarding the factor
$k \ln 2$, this gives the average number of yes-no decisions with which one of the possibilities
for the given probability distribution $\{\rho_z\}$ may be determined. This entropy
is one of the most important parameters characterizing the statistical ensemble.


6.2 Entropy Theorem 
6.2.1 Entropy Law and Rate Equation 
The information entropy must satisfy the extremely important entropy law:

$$ \frac{dS}{dt} \ge 0 \quad \text{for all closed systems.} $$




This is also called Boltzmann’s H-theorem, because instead of the entropy, Boltzmann 
used the upper-case Greek letter Eta, which resembles the Latin letter H, and he 
defined $H = \mathrm{tr}(\rho \ln \rho) = -S/k$. We shall avoid this quantity here, because it could
be confused with the enthalpy, which, according to international recommendations, 
should be abbreviated with the Latin H. Here a system is called closed, if it is not in 
contact with the environment, whence it exchanges neither energy nor particles, nor 
anything else. Therefore, in addition to invariable macro parameters, only its entropy 
depends on the time (or the probability distribution, which for its part does depend 
on the external parameters, and the time). We shall only allow for changes in other 
macro parameters at the end of the next section. 

As will be shown in Sect. 6.2.3, this inequality follows from the rate equation for
the probability (also called the balance or master equation), demonstrated in Sect.
4.6.4:

$$ \frac{d\rho_z}{dt} = \sum_{z'} \big( W_{zz'}\,\rho_{z'} - W_{z'z}\,\rho_z \big) , $$

where $W_{z'z}$ ($\ge 0$) gives the transition rate from the state z into the state z′. Note that,
as in quantum theory, the final state is also on the left of the initial state here. On
p. 383, $W_{z'z} \propto |\langle z'|H|z\rangle|^2$ for $z' \ne z$ was already determined. Such rate equations
are often set as an ansatz (further examples in the second part of this section), which
should not be confused with the (entropy conserving) Liouville or von Neumann
equation, which we shall discuss in Sect. 6.2.3. The term $\sum_{z'} W_{zz'}\,\rho_{z'}$ is the gain rate
and $\sum_{z'} W_{z'z}\,\rho_z$ the loss rate for the state z, and the balance depends on both.
As a rate equation, we may also take the diffusion equation

$$ \frac{\partial \rho}{\partial t} = D\,\Delta \rho , $$

as we shall now show in one dimension, in particular, with $\partial^2 \rho/\partial z^2$ instead of $\Delta \rho$.
Thus we discretize the position parameter z of the cell with size $\delta z$ and obtain a
connection with the neighboring cells:

$$ \frac{d\rho_z}{dt} \approx D\,\frac{\rho_{z+1} - 2\rho_z + \rho_{z-1}}{(\delta z)^2} . $$

The transition rate W and the diffusion constant D are related by

$$ W_{zz'} = \delta_{z,\,z' \pm 1}\,D/(\delta z)^2 = W_{z'z} , $$

and from $W_{zz'} \ge 0$, it then follows that $D \ge 0$.
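The discretized diffusion equation can be integrated step by step like any other rate equation. The following is a minimal sketch with explicit Euler steps; the periodic boundary (a ring of 51 cells), the parameter values, and the function name are assumptions made for this illustration and are not taken from the text.

```python
def diffusion_step(rho, rate, dt):
    """One Euler step of d rho_z/dt = W (rho_{z+1} - 2 rho_z + rho_{z-1}) on a ring of cells."""
    n = len(rho)
    return [rho[z] + dt * rate * (rho[(z + 1) % n] - 2 * rho[z] + rho[(z - 1) % n])
            for z in range(n)]

D, dz, dt = 1.0, 0.1, 0.001          # hypothetical diffusion constant, cell size, time step
rate = D / dz**2                     # transition rate W between neighboring cells
steps = 200

rho = [0.0] * 51
rho[25] = 1.0                        # start with certainty in the central cell

for _ in range(steps):
    rho = diffusion_step(rho, rate, dt)

# Width of the distribution in units of the cell size
variance_cells = sum((z - 25)**2 * r for z, r in enumerate(rho))
print("total probability:", sum(rho))
print("variance * dz^2  :", variance_cells * dz**2, " 2 D t:", 2 * D * steps * dt)
```

With the small time step chosen here (W dt = 0.1), all probabilities stay non-negative, the total probability is conserved, and the variance grows as 2Dt as long as the distribution has not yet spread around the ring.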

While open systems may prefer the transition in one direction (for example, they
transmit energy to a colder environment), for closed systems, $W_{zz'} = W_{z'z}$. Therefore,
the rate equation for closed systems simplifies to $\dot\rho_z = \sum_{z'} W_{zz'}\,(\rho_{z'} - \rho_z)$. Hence,

$$ \frac{dS}{dt} = -k \sum_z \frac{d(\rho_z \ln \rho_z)}{d\rho_z}\,\frac{d\rho_z}{dt} = k \sum_{zz'} W_{zz'}\,(\rho_z - \rho_{z'})\,(\ln \rho_z + 1) . $$




If we now swap the summation indices ($z \leftrightarrow z'$) and add the expressions, we obtain

$$ 2\,\frac{dS}{dt} = k \sum_{zz'} W_{zz'}\,(\rho_z - \rho_{z'})\,(\ln \rho_z - \ln \rho_{z'}) . $$

With $\rho_z > \rho_{z'}$, we also have $\ln \rho_z > \ln \rho_{z'}$, so there are no negative terms here. The
entropy law thus follows from the rate equation if the transition rates $W_{zz'}$ and $W_{z'z}$
are equal, and this applies to closed systems.

The entropy increases until it has taken the largest value compatible with the
remaining constraints. In particular, the probabilities no longer change at all if
$W_{zz'}\,\rho_{z'} = W_{z'z}\,\rho_z$ holds for all pairs (z, z′). In this situation, the system is said to be
in detailed equilibrium.
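The entropy law can be watched at work by integrating the rate equation for a small closed system. In the following sketch, the four states, the symmetric transition rates, the initial distribution, and the explicit Euler integration are all assumptions made for illustration.

```python
from math import log

def entropy_over_k(rho):
    """S/k = -sum_z rho_z ln rho_z; unoccupied states contribute nothing."""
    return -sum(r * log(r) for r in rho if r > 0)

def rate_step(rho, W, dt):
    """One Euler step of d rho_z/dt = sum_z' (W[z][z'] rho_z' - W[z'][z] rho_z)."""
    n = len(rho)
    return [rho[z] + dt * sum(W[z][y] * rho[y] - W[y][z] * rho[z]
                              for y in range(n) if y != z)
            for z in range(n)]

# Hypothetical closed system: symmetric rates W[z][z'] = W[z'][z]
Z = 4
W = [[0.0 if z == y else 1.0 / (1 + abs(z - y)) for y in range(Z)] for z in range(Z)]

rho = [0.7, 0.2, 0.1, 0.0]           # arbitrary initial distribution
dt = 0.01
history = [entropy_over_k(rho)]
for _ in range(2000):
    rho = rate_step(rho, W, dt)
    history.append(entropy_over_k(rho))

print("final distribution:", rho)
print("final S/k:", history[-1], " ln Z:", log(Z))
```

For symmetric rates the stationary distribution is the uniform one, so the entropy approaches its largest value ln Z, and detailed equilibrium holds in the final state.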


6.2.2 Irreversible Changes of State and Relaxation-Time 
Approximation 


If the entropy of a closed system has increased in the course of time, then according to 
the entropy law, it never ever returns to the initial state by itself, because the entropy 
would have to decrease again. The change of state is thus not reversible, and is said 
to be irreversible. 

We take a two-level system as the simplest example. We already investigated its
rate equation in Sect. 4.6.4. With $\rho_1 + \rho_2 = 1$, it may be decoupled to yield

$$\dot\rho_1 = W_{12}\,\rho_2 - W_{21}\,\rho_1 = W_{12} - (W_{12} + W_{21})\,\rho_1\,,$$

whence it has the solution

$$\rho_1(t) = W_{12}\tau + \{\rho_1(0) - W_{12}\tau\}\,\exp\frac{-t}{\tau}\,,$$

with the relaxation time

$$\tau = \frac{1}{W_{12} + W_{21}}\,.$$

In quantum physics, $\tau$ is called the average lifetime. It is occasionally replaced
by the decay time $T_{1/2} = \tau\ln 2$, because $\frac12 = \exp(-T_{1/2}/\tau)$. The relaxation
time is a measure of how fast equilibrium is reached. The more strongly the two states are
coupled to each other, the faster this happens. The solution $\rho_1$ approaches the limiting
value $W_{12}\tau$ monotonically, and $\rho_2 = 1 - \rho_1$ the value $W_{21}\tau$. The value with the
highest entropy (here 1/2) is reserved for a closed system, in particular, when $W_{12} = W_{21}$.
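
As a quick check of the closed-form solution, the sketch below compares it with a direct numerical integration of the same rate equation. The rate values and the function names are illustrative assumptions, not taken from the text.

```python
import math

def rho1_exact(t, rho1_0, W12, W21):
    """Closed-form solution of d(rho1)/dt = W12 - (W12 + W21) * rho1."""
    tau = 1.0 / (W12 + W21)
    return W12 * tau + (rho1_0 - W12 * tau) * math.exp(-t / tau)

def rho1_euler(t, rho1_0, W12, W21, steps=100_000):
    """Explicit Euler integration of the same rate equation (for comparison)."""
    dt = t / steps
    rho1 = rho1_0
    for _ in range(steps):
        rho1 += dt * (W12 - (W12 + W21) * rho1)
    return rho1

W12, W21 = 0.3, 0.7            # hypothetical rates (open system: W12 != W21)
tau = 1.0 / (W12 + W21)        # relaxation time
half_life = tau * math.log(2)  # decay time T_1/2 = tau ln 2
```

After one decay time `half_life`, the distance of `rho1_exact` from its limiting value `W12 * tau` has dropped to half its initial size.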

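For Z states, we modify the rate equation into a linear system of equations
$\dot\rho_z = \sum_{z'} a_{zz'}\,\rho_{z'}$, where

$$a_{zz'} = \begin{cases} W_{zz'} & \text{for } z \ne z'\,, \\[2pt]
 -\sum_{z''\ne z} W_{z''z} & \text{for } z = z'\,. \end{cases}$$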




The sums of the columns in its coefficient matrix $(a_{zz'})$ are all zero, whence two
important properties follow. Firstly, the determinant of this matrix must be zero, and
therefore there is a zero eigenvalue, hence a stationary eigensolution. The second
property follows because only the diagonal elements are actually negative: all
eigenvalues have a non-positive real part. According to Gerschgorin's theorem [5], the
position (in the complex plane) of the (suitably ordered) kth eigenvalue of a complex
matrix has a distance from the kth diagonal element which is at most the sum of
the moduli of the non-diagonal elements of the kth column. If the transition rates for
inverse processes are equal (as for each closed system), then the matrix is Hermitian
and thus has only real eigenvalues, which we set equal to $-\tau_k^{-1}$ (each $\tau_k$ is then a
relaxation time). We presume in the following that the eigenvalue 0 is not degenerate,
otherwise there may be different final states. Then the solutions $\rho_z(t)$ of the rate
equation each consist of a constant term $\rho_z(\infty)$ and $Z - 1$ terms $c_{zk}\exp(-t/\tau_k)$.
After a sufficiently long time, only the largest value of the $\tau_k$ is important, which we
now denote by $\tau$:

$$\rho_z(t) \approx \rho_z(\infty) + c_z\,\exp\frac{-t}{\tau}\,.$$

In this relaxation-time approximation, the factors $c_z$ are determined by the initial
state. If it differs only little from the final state, we may approximate by setting
$c_z \approx \rho_z(0) - \rho_z(\infty)$.

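The stationary final state is given by $\dot\rho_z = 0$ (for all z). With

$$\sum_{z'} a_{zz'}\,\rho_{z'}(\infty) = 0\,, \qquad \sum_z \rho_z(\infty) = 1\,,$$

it may be traced back to the adjoint $A_{zz'}$ of the matrix $(a_{zz'})$ of coefficients. Then, up
to the sign $(-)^{z+z'}$, the adjoint $A_{zz'}$ is the sub-determinant (or first minor) generated
by eliminating the zth row and z'th column, and therefore
$\det a = \sum_{z'} a_{zz'} A_{zz'}$ (which is zero here), while $\sum_{z'} a_{zz'} A_{z''z'} = 0$ for $z \ne z''$:

$$\rho_z(\infty) = \frac{A_{z'z}}{\sum_{z''} A_{z'z''}}\,,$$

where here $z'$ may be chosen arbitrarily. For $W_{zz'} = W_{z'z}$, the matrix $(a_{zz'})$ is also
symmetric, and therefore $\sum_z a_{zz'} = 0$ implies that $\sum_{z'} a_{zz'} = 0$ and all $\rho_z(\infty)$ are
equally large.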
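
The construction of the coefficient matrix and of the stationary state from the adjoints can be sketched as follows. This is an illustration under stated assumptions: the rate table `W` (with `W[z][z']` read as the rate for the transition z' to z) and all function names are hypothetical, and the determinant routine is only meant for small Z.

```python
def rate_matrix(W):
    """a[z][z'] = W[z][z'] for z != z', a[z][z] = -sum over z'' != z of W[z''][z]."""
    Z = len(W)
    a = [[W[z][zp] if z != zp else 0.0 for zp in range(Z)] for z in range(Z)]
    for z in range(Z):
        a[z][z] = -sum(W[zpp][z] for zpp in range(Z) if zpp != z)
    return a

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small Z)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def cofactor(a, i, j):
    """Signed sub-determinant obtained by deleting row i and column j."""
    minor = [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]
    return (-1) ** (i + j) * det(minor)

def stationary(a, ref_row=0):
    """rho_z(inf) = A[ref_row][z] / sum_z'' A[ref_row][z'']; ref_row is arbitrary."""
    cof = [cofactor(a, ref_row, z) for z in range(len(a))]
    norm = sum(cof)
    return [c / norm for c in cof]

# Hypothetical, non-symmetric rates (open system).
W = [[0.0, 2.0, 0.0],
     [1.0, 0.0, 3.0],
     [0.5, 1.0, 0.0]]
a = rate_matrix(W)
rho_inf = stationary(a)
```

For this assumed table the columns of `a` sum to zero and the stationary probabilities are 6/13, 4.5/13, and 2.5/13, whichever reference row is used. A symmetric table gives equal probabilities.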

Radioactive decay corresponds to an open system. The decay products move away
from each other and never recombine. Therefore, there is in fact a transition mother
→ daughter, but not vice versa. From the differential equation $\dot\rho = -\rho/\tau$ for the
probability of the mother state, we obtain the solution $\rho(t) = \rho(0)\exp(-t/\tau)$. Note
that the solution for the final state can be broken up into three factors: two for the
decay products and one for the relative motion. According to p. 525, a great many
possible states with energies between $E_R - \frac12 dE$ and $E_R + \frac12 dE$ belong to this third
factor, in fact, $4\pi V h^{-3} m \sqrt{2mE_R}\,dE$, implying therefore a high entropy.


Fig. 6.6 Time dependence of the stepwise decay 1 → 2 → 3, here for $\tau_2 = 3\tau_1$
(time axis from 0 to $10\tau_1$)


For a stepwise decay, we have to set up the following system of equations for
the probabilities of the radiating substances and the final products, once again with
$W_{zz'} \ne W_{z'z}$:

$$\dot\rho_1 = -\frac{\rho_1}{\tau_1}\,, \qquad
 \dot\rho_2 = \frac{\rho_1}{\tau_1} - \frac{\rho_2}{\tau_2}\,, \qquad
 \dot\rho_3 = \frac{\rho_2}{\tau_2}\,,$$

with the solutions

$$\rho_1(t) = \rho_1(0)\,\exp\frac{-t}{\tau_1}\,,$$

$$\rho_2(t) = \rho_1(0)\,\frac{\tau_2}{\tau_1 - \tau_2}\left\{\exp\frac{-t}{\tau_1} - \exp\frac{-t}{\tau_2}\right\},$$

$$\rho_3(t) = \rho_1(0) - \rho_1(t) - \rho_2(t)\,,$$

if we restrict ourselves to $\rho_1(0) = 1$, and therefore $\rho_2(0) = \rho_3(0) = 0$ (see Fig. 6.6).
(But note that, with $\tau_1 = \tau_2$, we have $\rho_2(t) = \rho_1(t)\,t/\tau_1$.) According to the above,
we have $\rho_3(\infty) = \rho_1(0)$. But this does not mean that the entropies of the initial and
final states are equal, because once again the relative motion is missing, and this
would lead to an increase in the entropy.
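
The solutions of the decay chain are easy to evaluate directly. The sketch below is illustrative only (the function name and the sample lifetimes are assumptions); it implements the three probabilities, including the limiting case of equal lifetimes.

```python
import math

def stepwise_decay(t, tau1, tau2, rho1_0=1.0):
    """Probabilities (rho1, rho2, rho3) of the chain 1 -> 2 -> 3 at time t."""
    rho1 = rho1_0 * math.exp(-t / tau1)
    if math.isclose(tau1, tau2):
        rho2 = rho1 * t / tau1          # limiting case tau1 = tau2
    else:
        rho2 = (rho1_0 * tau2 / (tau1 - tau2)
                * (math.exp(-t / tau1) - math.exp(-t / tau2)))
    rho3 = rho1_0 - rho1 - rho2
    return rho1, rho2, rho3

# Example with the hypothetical choice tau2 = 3 * tau1.
r1, r2, r3 = stepwise_decay(2.0, tau1=1.0, tau2=3.0)
```

The three probabilities add up to the initial value at every time, and for long times everything ends up in the final product.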


6.2.3 Liouville and Collision-Free Boltzmann Equation 


In classical mechanics, we label each N-particle system by a point in the (larger)
phase space and a statistical ensemble of such systems by a swarm of points with the
probability density $\rho(t, x, p)$. The single points move in this space as time goes by,
but their total number remains constant. We then have the Liouville equation

$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t}
 + \sum_{k=1}^{3N}\left(\frac{\partial\rho}{\partial x^k}\,\dot x^k
 + \frac{\partial\rho}{\partial p_k}\,\dot p_k\right) = 0\,.$$


We proved this in Sect. 2.4.4: a volume element in the phase space keeps its probability
if it follows the equations of motion (by swimming along the particle trajectories,
as it were), as for an incompressible liquid. Its shape can in fact change, but not
its content. Recall also that, according to p. 342, the von Neumann equation is the
quantum theoretical counterpart of the Liouville equation. This is even more general
than the (time-dependent) Schrödinger equation, because it holds not only for pure
states, but also for mixtures.

Under special conditions, the Liouville equation is also called the collision-free
Boltzmann equation, in particular, if there is a swarm of interaction-free molecules,
which cannot therefore collide. Then the probability distribution $\rho(t, \mathbf r, \mathbf p)$ of one
molecule suffices, because any other will have the same distribution. Note that, since
there are no correlations, the probability distribution of the gas factorizes even if not
all the molecules have the same mass, although then that will appear differently in
$\rho(t, \mathbf r, \mathbf p)$. Momentum changes may be traced back to an external force $\mathbf F = \dot{\mathbf p}$ via

$$\left(\frac{\partial}{\partial t} + \mathbf v\cdot\nabla_{\mathbf r} + \mathbf F\cdot\nabla_{\mathbf p}\right)\rho(t, \mathbf r, \mathbf p) = 0\,.$$

Note, however, that the canonical momentum then has to be equal to the mechanical
one, but charged particles would also interact with each other. If we take the velocity
$\mathbf v$ instead of the momentum $\mathbf p$, then setting $\dot{\mathbf v} = \mathbf a$, we obtain the collision-free
Boltzmann equation

$$\left(\frac{\partial}{\partial t} + \mathbf v\cdot\nabla_{\mathbf r} + \mathbf a\cdot\nabla_{\mathbf v}\right)\rho(t, \mathbf r, \mathbf v) = 0\,,$$

which in plasma physics is also called the Vlasov equation. For $\mathbf a = 0$, it is solved
by any function $\rho(\mathbf r - \mathbf v t, \mathbf v)$.
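
The last statement can be checked numerically in one space dimension. The following sketch is only an illustration: the profile `f` is an arbitrary assumed choice, and the residual is estimated with central differences rather than computed exactly.

```python
import math

def f(x, v):
    """Hypothetical initial profile f(x, v); any smooth function would do."""
    return math.exp(-x * x) * math.exp(-0.5 * v * v)

def rho(t, x, v):
    """Free-streaming solution rho(t, x, v) = f(x - v t, v) for a = 0."""
    return f(x - v * t, v)

def residual(t, x, v, h=1e-5):
    """Central-difference estimate of (d/dt + v d/dx) rho, which should vanish."""
    d_t = (rho(t + h, x, v) - rho(t - h, x, v)) / (2 * h)
    d_x = (rho(t, x + h, v) - rho(t, x - h, v)) / (2 * h)
    return d_t + v * d_x
```

Evaluating `residual` at any point, for example `residual(0.7, 0.3, 1.5)`, gives a value that is zero up to the finite-difference error.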

For all these examples, the total entropy is conserved if there is no friction force.
(Actually, as mentioned before, we cannot take a logarithm of a density, because
it carries a dimension. But we may divide the phase space into cells and associate
probabilities with them.) With $\partial(\rho\ln\rho)/\partial t = (\ln\rho + 1)\,\partial\rho/\partial t$, the collision-free
Boltzmann equation delivers
$\partial(\rho\ln\rho)/\partial t = -(\mathbf v\cdot\nabla_{\mathbf r} + \mathbf a\cdot\nabla_{\mathbf v})\,\rho\ln\rho$, and therefore

$$\frac{dS}{dt} = -k\int \frac{\partial(\rho\ln\rho)}{\partial t}\;d^3r\,d^3v
 = k\int (\mathbf v\cdot\nabla_{\mathbf r} + \mathbf a\cdot\nabla_{\mathbf v})\,\rho\ln\rho\;d^3r\,d^3v\,.$$

Since the velocity cannot be arbitrarily high, the surface integral of $\mathbf a\,\rho\ln\rho$ in the
velocity space vanishes. Therefore, Gauss's theorem supplies

$$\int \mathbf a\cdot\nabla_{\mathbf v}\,\rho\ln\rho\;d^3v = -\int \rho\ln\rho\;\nabla_{\mathbf v}\cdot\mathbf a\;d^3v\,.$$

Since the external force, and hence the acceleration $\mathbf a$, should not depend on the
velocity, the last expression vanishes. For a friction force, the situation is different,
but this can be traced back to collisions which we will account for only in the next
section.




In order to determine a local change in the entropy, we integrate the further terms
only over the velocity. Since $\mathbf r$ and $\mathbf v$ are mutually independent variables, we find

$$\int \mathbf v\cdot\nabla_{\mathbf r}\,\rho\ln\rho\;d^3v = \nabla_{\mathbf r}\cdot\int \mathbf v\,\rho\ln\rho\;d^3v\,.$$

The entropy may thus change locally, but not globally, because then according to
Gauss's theorem, the surface integral would have to be investigated. But for $r \to \infty$,
the factor $\rho\ln\rho$ is zero.


6.2.4 Boltzmann Equation 


We now consider an example in which the entropy can increase with time. If
molecules of equal mass collide, then further terms appear in the Boltzmann equation
mentioned above, which describe the collision-induced gain and loss of the
probability density $\rho(t, \mathbf r, \mathbf v)$:

$$\left(\frac{\partial}{\partial t} + \mathbf v\cdot\nabla_{\mathbf r} + \mathbf a\cdot\nabla_{\mathbf v}\right)\rho(t, \mathbf r, \mathbf v) = R_+ - R_-\,.$$

This relation is also more general than the rate equation initially considered, because
in fact $d\rho/dt$ stands on the left, while on the right the gain and loss have not been
split up into transition rate and density. This will be done later.

We evaluate the new terms using the following approximations. Firstly, we account
for collisions between only two particles and restrict ourselves to time spans during
which a molecule collides at most once. Both assumptions presume a sufficiently
low density. Secondly, we neglect the influence of the container walls, which is justified
for sufficiently large systems. Thirdly, we restrict ourselves to elastic scattering
(point-like collision partners without internal degrees of freedom). The differential
scattering cross-section $\sigma$ may depend only on the velocities. Finally, in addition
to energy and momentum conservation, we also make use of space-inversion and
time-reversal invariance:

$$\begin{aligned}
\sigma(\mathbf v_1, \mathbf v_2 \to \mathbf v_1', \mathbf v_2')
 &= \sigma(-\mathbf v_1, -\mathbf v_2 \to -\mathbf v_1', -\mathbf v_2')\,, &\quad \mathbf r &\to -\mathbf r\,,\\
 &= \sigma(-\mathbf v_1', -\mathbf v_2' \to -\mathbf v_1, -\mathbf v_2)\,, &\quad t &\to -t\,.
\end{aligned}$$

Then the scattering cross-sections for inverse collisions are equal,

$$\sigma(\mathbf v_1, \mathbf v_2 \to \mathbf v_1', \mathbf v_2') = \sigma(\mathbf v_1', \mathbf v_2' \to \mathbf v_1, \mathbf v_2)\,,$$

something we shall use to establish the relation between $R_+$ and $R_-$, or to establish
$W_{zz'} = W_{z'z}$. Due to energy and momentum conservation, $\mathbf v_1$ and $\mathbf v_2$ already fix $\mathbf v_1'$
and $\mathbf v_2'$ except for the direction of the relative velocity. For the proof in the next
section, this is of no help. Instead of $\int \sigma\,d\Omega$, it is better to write
$\int \sigma(\mathbf v_1, \mathbf v_2 \to \mathbf v_1', \mathbf v_2')\,d^3v_1'\,d^3v_2'$. Then $\sigma$ is not actually an area, but it is
probably not appropriate to use another letter.

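Here the decrease in the probability density $\rho(t, \mathbf r, \mathbf v_1)$ is the product of the
scattering cross-section and the current strength, which itself may be calculated
from the probability density and the relative velocity:

$$R_-(t, \mathbf r, \mathbf v_1) = \int \sigma(\mathbf v_1, \mathbf v_2 \to \mathbf v_1', \mathbf v_2')\;\rho(t, \mathbf r, \mathbf v_1, \mathbf v_2)\;|\mathbf v_1 - \mathbf v_2|\;d^3v_2\,d^3v_1'\,d^3v_2'\,.$$

For the gain in the probability density, on the other hand, we obtain

$$R_+(t, \mathbf r, \mathbf v_1) = \int \sigma(\mathbf v_1', \mathbf v_2' \to \mathbf v_1, \mathbf v_2)\;\rho(t, \mathbf r, \mathbf v_1', \mathbf v_2')\;|\mathbf v_1' - \mathbf v_2'|\;d^3v_2\,d^3v_1'\,d^3v_2'\,.$$

Since the scattering cross-sections for inverse collisions are equal and the energy is
conserved, whence also $|\mathbf v_1' - \mathbf v_2'| = |\mathbf v_1 - \mathbf v_2|$, this may be reformulated as

$$R_+(t, \mathbf r, \mathbf v_1) = \int \sigma(\mathbf v_1, \mathbf v_2 \to \mathbf v_1', \mathbf v_2')\;\rho(t, \mathbf r, \mathbf v_1', \mathbf v_2')\;|\mathbf v_1 - \mathbf v_2|\;d^3v_2\,d^3v_1'\,d^3v_2'\,.$$

Finally, we obtain

$$\left(\frac{\partial}{\partial t} + \mathbf v_1\cdot\nabla_{\mathbf r} + \mathbf a\cdot\nabla_{\mathbf v_1}\right)\rho(t, \mathbf r, \mathbf v_1)
 = \int |\mathbf v_1 - \mathbf v_2|\;\sigma(\mathbf v_1, \mathbf v_2 \to \mathbf v_1', \mathbf v_2')\,
 \{\rho(t, \mathbf r, \mathbf v_1', \mathbf v_2') - \rho(t, \mathbf r, \mathbf v_1, \mathbf v_2)\}\;d^3v_2\,d^3v_1'\,d^3v_2'\,.$$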


On the left is the unknown probability distribution for a single particle, and on the
right the unknown probability distribution for two particles. This equation is soluble
only by a further approximation, derived from the assumption of molecular chaos:
the probability distribution of two particles (at time t and at the same position $\mathbf r$) is
assumed to factorize, the velocities of the colliding molecules being assumed not
to be correlated (such a factorization was already assumed in Sect. 4.6.1 in order to
arrive at a calculable expression for the dissipation in quantum-mechanical systems):

$$\rho(t, \mathbf r, \mathbf v_1, \mathbf v_2) = \rho(t, \mathbf r, \mathbf v_1)\cdot\rho(t, \mathbf r, \mathbf v_2)\,.$$

In this situation, we obtain a non-linear integro-differential equation known as the
Boltzmann equation (Boltzmann transport equation)

$$\left(\frac{\partial}{\partial t} + \mathbf v_1\cdot\nabla_{\mathbf r} + \mathbf a\cdot\nabla_{\mathbf v_1}\right)\rho(t, \mathbf r, \mathbf v_1)
 = \int |\mathbf v_1 - \mathbf v_2|\;\sigma(\mathbf v_1, \mathbf v_2 \to \mathbf v_1', \mathbf v_2')\,
 \{\rho(t, \mathbf r, \mathbf v_1')\,\rho(t, \mathbf r, \mathbf v_2') - \rho(t, \mathbf r, \mathbf v_1)\,\rho(t, \mathbf r, \mathbf v_2)\}\;d^3v_2\,d^3v_1'\,d^3v_2'\,.$$


The collision integral on the right-hand side can usually be further simplified by 
exploiting energy and momentum conservation (see the previous page). We have thus 
derived a balance equation and traced the transition rates back to known notions. 

Note that the Boltzmann equation may be used to describe a range of different 
transport processes, e.g., in reactors, superfluids, or stars [6]. 


6.2.5 Proof of the Entropy Law Using the Boltzmann 
Equation 


In order to investigate the influence of the collision integrals on the entropy, we begin
by excluding external forces ($\mathbf a = 0$) and assume that the probability density does
not depend upon the position, so that only $\rho(t, \mathbf v)$ appears. We then have

$$S(t) = -k\int \rho(t, \mathbf v)\,\ln\rho(t, \mathbf v)\;d^3v$$

and

$$-\frac{1}{k}\frac{dS}{dt} = \int \frac{\partial\rho}{\partial t}\,\{\ln\rho + 1\}\;d^3v
 = \int |\mathbf v_1 - \mathbf v_2|\;\sigma(\mathbf v_1, \mathbf v_2 \to \mathbf v_1', \mathbf v_2')\,
 \{\rho(t, \mathbf v_1')\,\rho(t, \mathbf v_2') - \rho(t, \mathbf v_1)\,\rho(t, \mathbf v_2)\}\,
 \{\ln\rho(t, \mathbf v_1) + 1\}\;d^3v_1\,d^3v_2\,d^3v_1'\,d^3v_2'\,.$$

With the symmetry of the collision partners 1 and 2, this may also be written as

$$-\frac{2}{k}\frac{dS}{dt}
 = \int |\mathbf v_1 - \mathbf v_2|\;\sigma(\mathbf v_1, \mathbf v_2 \to \mathbf v_1', \mathbf v_2')\,
 \{\rho(t, \mathbf v_1')\,\rho(t, \mathbf v_2') - \rho(t, \mathbf v_1)\,\rho(t, \mathbf v_2)\}\,
 \{\ln(\rho(t, \mathbf v_1)\,\rho(t, \mathbf v_2)) + 2\}\;d^3v_1\,d^3v_2\,d^3v_1'\,d^3v_2'\,.$$

Since inverse collisions have scattering cross-sections equal to the original ones
and since the modulus of the relative velocity remains conserved, we may swap the
primed and the unprimed velocities, and then, as on p. 527, infer $dS/dt \ge 0$.

If the probability density also depends on the position, we have to respect the additional
term $\int \mathbf v\cdot\nabla_{\mathbf r}\,\rho\ln\rho\;d^3r\,d^3v$. As shown in the section before last, the entropy
may then change locally, but not globally. Likewise, an external force $\mathbf F(\mathbf r)$ would
change nothing in the result.




The Boltzmann equation can be used, not only to prove the entropy law, but even 
to evaluate the entropy gain, provided that the scattering cross-section is known. It 
originates uniquely from the change in the states under collisions. There can be no 
entropy gain without collisions. 

It is well known that the usual basic equations of mechanics and electromagnetism
do not change under time reversal. To each solution of the basic equations belongs
a "time-reversed" solution, for which everything proceeds in the reverse order, i.e.,
t is replaced by −t. In particular, elastic scattering is invariant under time reversal,
and this has even been used explicitly. Nevertheless, the entropy of a closed system
may only increase with time, never decrease.

In reality, there is no contradiction. In fact, we evaluate the entropy using a different
distribution function from the one that actually enters the (time-reversal invariant)
Liouville equation. We describe the system with its vast number of degrees of freedom
using only a small number of variables, average over the remaining ones, and
thereby lose the time-inversion symmetry. This shows up, e.g., in the derivation of the
Boltzmann equation. Here the entropy changes, because we have assumed molecular
chaos: by doing this, we have averaged out possible correlations and lost information!
Actually, the one-particle density is related to the two-particle density, the two-particle
density to the three-particle density, and so on. Collisions couple the one- to the
many-body densities. But in order to be able to proceed at all, we have to terminate this
sequence somewhere and come back to molecular chaos.

Although these considerations were initially applied only to the calculated entropy, 
the question remains as to whether they might not also apply to the experimental 
quantity, if the entropy is used as a state variable like, e.g., energy or volume. In fact, 
we always adopt only a few state parameters, far too few to be able to describe a 
system microscopically. This will become clear in the next section. 

If the allowed states are all equally probable, the return probabilities of a many-body
system ($N \gg 1$) are unbelievably small. If, for example, each particle is independent
of the others and equally probable in both halves of a container, then all N
particles are in the one half only with the probability $2^{-N}$, thus for N = 100 only
with the probability $10^{-30}$ (see Problem 6.10).
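
The order of magnitude quoted for N = 100 follows from a one-line calculation, sketched here for illustration (the function name is an assumption):

```python
import math

def all_in_one_half(n_particles):
    """Probability 2**(-N) that N independent particles all sit in one given half."""
    return 0.5 ** n_particles

p100 = all_in_one_half(100)
log10_p100 = -100 * math.log10(2)   # decimal exponent of 2**(-100), about -30.1
```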


6.2.6 Molecular Motion and Diffusion 


In order to investigate the influence of correlations in more detail, we consider a
gas at rest, consisting of molecules of the same kind. Then $\langle\mathbf v\rangle = 0$ holds as the
ensemble average and also as the time average. Note that an ensemble is said to be
ergodic, if its ensemble average is equal to its time-average value. But $\langle v^2\rangle$ is not
zero. According to the equidistribution law on p. 559, the average kinetic energy
per degree of freedom for the absolute temperature T is $\frac12 kT$. We shall allow for
motions along a straight line, in a plane, or in space. Therefore, let n be the number
of dimensions. Consequently, $\langle v^2\rangle = nkT/m$.

Collisions alter the velocity of a test particle and lead to an irregularly fluctuating
acceleration $\mathbf a$ around the mean value zero. Then the auto-correlation function of the
velocity $\langle\mathbf v(t)\cdot\mathbf v(t')\rangle$ for $t = t'$ is in fact equal to $\langle v^2\rangle > 0$, but for $|t - t'| \to \infty$, it
surely approaches $\langle\mathbf v(t)\rangle\cdot\langle\mathbf v(t')\rangle$, i.e., it must approach zero. We set

$$\langle\mathbf v(t)\cdot\mathbf v(t')\rangle = \langle v^2\rangle\,\chi(t - t')\,,$$

with $\chi(t - t') = \chi(t' - t)$, $\chi(0) = 1$, and $\chi(\infty) = 0$. Up to the first collision, $\chi$ keeps
the same value, because until then the velocity does not change. Thus we shall assume
now that each individual collision proceeds very fast (an assumption we drop in the
section after next), and the initial and final velocities will no longer be correlated.
The probability of a collision is (supposedly) equally large for equal time spans. If
we call the average time up to a collision $\tau$, then we have

$$\chi(t) = \exp\frac{-|t|}{\tau}\,.$$

This $\tau$ does indeed correspond to a relaxation time. On average, in each time span
$\tau$, the same fraction of the original attributes is removed.

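If we choose the origin at $\mathbf r(0)$, then from $\mathbf r(t) = \int_0^t dt'\,\mathbf v(t')$ and
$\chi(t - t') = \chi(t' - t) = \exp(-|t - t'|/\tau)$, we find for the squared fluctuation

$$\langle r^2(t)\rangle = \int_0^t dt'\int_0^t dt''\,\langle\mathbf v(t')\cdot\mathbf v(t'')\rangle
 = 2\,\langle v^2\rangle\int_0^t dt'\int_0^{t'} dt''\,\chi(t' - t'')
 = 2\,\langle v^2\rangle\,\tau^2\left(\frac{t}{\tau} - 1 + \exp\frac{-t}{\tau}\right).$$

For $|t| \ll \tau$, this Ornstein–Fürth relation goes over into $\langle r^2\rangle \approx \langle v^2\rangle\,t^2$, and for
$t \gg \tau$, into $\langle r^2\rangle \approx 2\langle v^2\rangle\,\tau t$, and both are easy enough to understand: up to the first
collision, we have $\mathbf r = \mathbf v t$ and thus $\langle r^2\rangle = \langle v^2\rangle\,t^2$, but after many collisions $\langle r^2\rangle$
increases only in proportion to t (see Fig. 6.7). This is the same for random walks
and for diffusion, as we shall now show.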
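
Both limits of the Ornstein–Fürth relation, and the double integral it comes from, can be checked with a few lines. This is a sketch under stated assumptions: the function names, the sample values, and the simple midpoint quadrature are illustrative choices.

```python
import math

def mean_square_displacement(t, v2, tau):
    """Ornstein-Fuerth relation <r^2(t)> = 2 <v^2> tau^2 (t/tau - 1 + exp(-t/tau))."""
    x = t / tau
    return 2.0 * v2 * tau * tau * (x - 1.0 + math.exp(-x))

def msd_by_quadrature(t, v2, tau, n=400):
    """Midpoint-rule double integral of <v(t')·v(t'')> = v2 * exp(-|t' - t''|/tau)."""
    h = t / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            # midpoints t' = (i + 1/2) h and t'' = (j + 1/2) h differ by (i - j) h
            total += math.exp(-abs(i - j) * h / tau)
    return v2 * total * h * h
```

For `t` much smaller than `tau` the first function approaches `v2 * t**2`, and for `t` much larger than `tau` it approaches `2 * v2 * tau * t`.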


Fig. 6.7 Ornstein–Fürth relation. Distance of a gas molecule from its initial
position as a function of time $t/\tau$ (continuous red curve). For $t \gg \tau$, the
approximation shown as the dashed blue parabola holds


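For the random walk, we assume that the test body after each collision moves
along a new direction which is not correlated with the direction prior to the collision.
For N collisions therefore, using $\mathbf r = \sum_{i=1}^N s_i\,\mathbf e_i$ with $\langle\mathbf e_i\cdot\mathbf e_k\rangle = \delta_{ik}$, we obtain the
expression $\langle r^2\rangle = \sum_{i=1}^N \langle s_i^2\rangle$. Here $\langle s_i^2\rangle = \langle v^2\rangle\,\langle t_i^2\rangle$ and $\langle t_i^2\rangle = 2\tau^2$ is independent
of i, so $\langle r^2\rangle \propto N$ and hence proportional to the total time.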
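
A small Monte Carlo experiment illustrates this result. The sketch below makes several simplifying assumptions that are not in the text: motion along a line, a fixed speed (so that $\langle v^2\rangle$ is just `speed**2`), exponentially distributed free-flight times with mean `tau`, and a fixed random seed.

```python
import random

def random_walk_msd(n_collisions, tau=1.0, speed=1.0, walkers=10_000, seed=1):
    """Monte Carlo estimate of <r^2> for a 1D walk with uncorrelated directions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(walkers):
        r = 0.0
        for _ in range(n_collisions):
            t_i = rng.expovariate(1.0 / tau)   # free-flight time, mean tau, <t_i^2> = 2 tau^2
            e_i = rng.choice((-1.0, 1.0))      # new direction, uncorrelated with the old one
            r += speed * t_i * e_i
        total += r * r
    return total / walkers

msd = random_walk_msd(20)   # expected to lie close to 20 * speed**2 * 2 * tau**2 = 40
```

Doubling `n_collisions` should roughly double the estimate, within the statistical scatter of the sample.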

This squared fluctuation also increases in accordance with the diffusion equation

$$\frac{\partial\rho}{\partial t} = D\,\Delta\rho\,,$$

hence linearly with time. In particular, if we set the initial value $\rho(0, \mathbf r) = \delta(\mathbf r)$, then
for n dimensions, the solution of this differential equation (Problem 6.9) reads

$$\rho(t, \mathbf r) = \frac{\exp\{-r^2/(4Dt)\}}{\sqrt{4\pi Dt}^{\,n}}\,,$$

and with $\rho(0, \mathbf r) = f(\mathbf r)$,
$\rho(t, \mathbf r) = \int d^n r'\,f(\mathbf r')\,\exp\{-|\mathbf r - \mathbf r'|^2/4Dt\}/\sqrt{4\pi Dt}^{\,n}$
then solves the diffusion equation (see Fig. 6.8).

From this we obtain $\langle r^2\rangle = 2nDt$. Comparing with the expression $\langle r^2\rangle \approx 2\langle v^2\rangle\,\tau t$
derived above, we arrive at $nD = \langle v^2\rangle\,\tau$. The relation $\langle v^2\rangle = nkT/m$ is generally
used:

$$D = \frac{kT}{m}\,\tau\,.$$

The diffusion constant D is thus related to the relaxation time $\tau$, where the mass of
the test particle and the temperature of its environment are also involved.
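
The linear growth $\langle r^2\rangle = 2nDt$ can also be seen directly in the discretized form of the diffusion equation (cells of size $\delta z$ coupled to their neighbors). The following one-dimensional sketch uses assumed values for `D`, `dz`, `dt`, and the grid size, and an explicit time step, which is only stable for `D * dt / dz**2` up to 1/2.

```python
def diffuse(rho, D, dz, dt, steps):
    """Explicit update rho_z += dt * D * (rho_{z+1} - 2 rho_z + rho_{z-1}) / dz**2."""
    lam = D * dt / dz**2          # must not exceed 1/2 for stability
    for _ in range(steps):
        new = rho[:]
        for z in range(1, len(rho) - 1):
            new[z] = rho[z] + lam * (rho[z + 1] - 2.0 * rho[z] + rho[z - 1])
        rho = new
    return rho

def variance(rho, dz):
    """<x^2> of a distribution centred on the middle cell."""
    mid = len(rho) // 2
    return sum(p * ((z - mid) * dz) ** 2 for z, p in enumerate(rho)) / sum(rho)

D, dz, dt, steps = 1.0, 1.0, 0.2, 150
rho0 = [0.0] * 401
rho0[200] = 1.0                   # all probability starts in the central cell
rho_t = diffuse(rho0, D, dz, dt, steps)
```

After the total time `steps * dt = 30`, `variance(rho_t, dz)` equals `2 * D * 30 = 60` up to rounding, as long as the distribution has not reached the edge of the grid.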

As already mentioned, the result $\langle r^2\rangle \propto t$ can be valid only for sufficiently long
times, because up to the first collision, $\langle r^2\rangle \propto t^2$ has to hold. We could also have
derived the relation $\langle r^2\rangle = 2nDt$ for all $t > 0$ by using the ansatz
$\langle\mathbf v(t)\cdot\mathbf v(t')\rangle = 2nD\,\delta(t - t')$. However, taking the auto-correlation function
to be a delta function is only an approximation. The diffusion equation has to be
improved at the outset. Only the differential equation (improved diffusion equation)

$$\frac{\partial\rho}{\partial t} = (1 - e^{-t/\tau})\,D\,\Delta\rho$$

is solved, for the initial condition $\rho(0, \mathbf r) = \delta(\mathbf r)$, by

$$\rho(t, \mathbf r) = \frac{\exp(-r^2/4Dt')}{\sqrt{4\pi Dt'}^{\,n}}\,, \quad\text{with}\quad t' = t - \tau\,(1 - e^{-t/\tau})\,,$$

and this leads to the Ornstein–Fürth relation $\langle r^2\rangle = 2nD\,t'$, and for $t \ll \tau$ to
$\langle r^2\rangle \approx \langle v^2\rangle\,t^2$.

These considerations are also valid for Brownian molecular motion, where an
inert particle is struck by much faster ones. However, its velocity in this collision
does not change as much as above and its relaxation time $\tau$ is therefore very much
longer than the average time between two collisions.

Fig. 6.8 One-dimensional diffusion. Shown is the distribution function
$\sqrt{D\tau}\,\rho(t, x)$ as a function of $x/\sqrt{D\tau}$ at three times up to $t = \tau$ (red, blue,
and green curves). For $t \to \infty$, we find $\rho \to 0$


6.2.7 Langevin Equation 


In the preceding section, we determined $\langle r^2(t)\rangle$ with a time-dependent probability
density $\rho(t, \mathbf r)$. This corresponds to the Schrödinger picture (Sect. 4.4.2) in quantum
theory. There we also used the Heisenberg picture, in which the probability density does
not depend on time, but the observable $\mathbf r$ does. This picture has the advantage
that derivatives of mean values with respect to time are equal to mean values of
derivatives with respect to time.

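If we differentiate the Ornstein–Fürth relation

$$\langle r^2(t)\rangle = 2\,\langle v^2\rangle\,\tau\,\{t - \tau\,(1 - e^{-t/\tau})\}$$

with respect to time, we obtain

$$\langle\mathbf r\cdot\mathbf v\rangle = \langle v^2\rangle\,\tau\left(1 - \exp\frac{-t}{\tau}\right).$$

If we differentiate this once more with respect to time, then we obtain
$\langle v^2\rangle + \langle\mathbf r\cdot\dot{\mathbf v}\rangle$ on the left and
$\langle v^2\rangle\,e^{-t/\tau} = \langle v^2\rangle - \langle\mathbf r\cdot\mathbf v\rangle/\tau$ on the right. It therefore follows that

$$\langle\mathbf r\cdot\dot{\mathbf v}\rangle = -\frac{\langle\mathbf r\cdot\mathbf v\rangle}{\tau}
 = -\langle v^2\rangle\left(1 - \exp\frac{-t}{\tau}\right).$$

At the beginning, when $|t| \ll \tau$, it is clear that $\langle\mathbf r\cdot\mathbf v\rangle \approx \langle v^2\rangle\,t$ and
$\langle\mathbf r\cdot\dot{\mathbf v}\rangle \approx -\langle v^2\rangle\,t/\tau$, while later, when $t \gg \tau$, the two correlation functions
$\langle\mathbf r\cdot\mathbf v\rangle \approx \langle v^2\rangle\,\tau > 0$ and $\langle\mathbf r\cdot\dot{\mathbf v}\rangle \approx -\langle v^2\rangle < 0$ are constant. These properties,
including the sign, are easily understood for diffusion: initially, $\mathbf r$, $\mathbf v$, and $\dot{\mathbf v}$ are
independent of each other, but then a correlation is established, and collisions hinder
the diffusion, rather as for a frictional force.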




Fig. 6.9 Stochastic force as a function of time. This acts irregularly in time, strength, and direction 
(only one component is shown here) 


This is taken into account by the Langevin equation:

$$\frac{d\mathbf v}{dt} = -\frac{\mathbf v}{\tau} + \mathbf a\,.$$

It is generally set in the form

$$\mathbf F = \mathbf F' - \alpha\,\mathbf v\,, \quad\text{with}\quad \langle\mathbf F'\rangle = 0\,,$$

and $\alpha = m/\tau$ is referred to as a frictional constant. We have already investigated a
Stokes frictional force $-\alpha\mathbf v$ on p. 99. The stochastic force $\mathbf F'$ fluctuates irregularly to
and fro (see Fig. 6.9), and cancels out in the ensemble and the time average. The same
holds for the stochastic acceleration $\mathbf a(t)$, which differs from the derivative of the
velocity with respect to time.

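The Langevin equation actually yields the required properties of $\langle\mathbf r\cdot\dot{\mathbf v}\rangle$ and
$\langle\mathbf r\cdot\mathbf v\rangle$. Since no correlations are to be expected between $\mathbf r$ and $\mathbf a$ (at equal times), and
since $\langle\mathbf r\cdot\mathbf a\rangle$ therefore vanishes, we deduce $\langle\mathbf r\cdot\dot{\mathbf v}\rangle = -\langle\mathbf r\cdot\mathbf v\rangle/\tau$ and in addition

$$\frac{d\langle\mathbf r\cdot\mathbf v\rangle}{dt} + \frac{\langle\mathbf r\cdot\mathbf v\rangle}{\tau} = \langle v^2\rangle\,.$$

Since $\langle v^2\rangle$ does not depend on time and $\langle\mathbf r\cdot\mathbf v\rangle$ vanishes initially,

$$\langle\mathbf r\cdot\mathbf v\rangle = \langle v^2\rangle\,\tau\left(1 - \exp\frac{-t}{\tau}\right)$$

solves the problem. Then all requirements are satisfied, and the Ornstein–Fürth relation
follows (with $\langle r^2(0)\rangle = 0$) by integrating over time.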
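
The two steps just described (solve the equation for $\langle\mathbf r\cdot\mathbf v\rangle$, then integrate once more over time) can be reproduced numerically. The sketch below is an illustration with assumed parameter values; it uses Heun's method for the first moment and the trapezoidal rule for $d\langle r^2\rangle/dt = 2\langle\mathbf r\cdot\mathbf v\rangle$.

```python
import math

def integrate_moments(t_end, v2, tau, steps=20_000):
    """Integrate d<r·v>/dt = <v^2> - <r·v>/tau and d<r^2>/dt = 2 <r·v> from zero."""
    dt = t_end / steps
    rv, r2 = 0.0, 0.0                          # both moments vanish at t = 0
    for _ in range(steps):
        k1 = v2 - rv / tau
        rv_pred = rv + dt * k1
        k2 = v2 - rv_pred / tau
        rv_new = rv + 0.5 * dt * (k1 + k2)     # Heun (predictor-corrector) step
        r2 += 0.5 * dt * 2.0 * (rv + rv_new)   # trapezoidal rule for <r^2>
        rv = rv_new
    return rv, r2

t_end, v2, tau = 3.0, 2.0, 0.5                 # hypothetical values
rv_num, r2_num = integrate_moments(t_end, v2, tau)
rv_exact = v2 * tau * (1.0 - math.exp(-t_end / tau))
r2_exact = 2.0 * v2 * tau * (t_end - tau * (1.0 - math.exp(-t_end / tau)))
```

The numerical moments agree with the closed forms, the second of which is the Ornstein–Fürth relation.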

We know the solution of the Langevin equation, because in Sect. 2.3.8 we treated
the forced damped oscillation and solved a still more general inhomogeneous differential
equation via a Laplace transformation. The solution to

$$\ddot x(t) + 2\gamma\,\dot x(t) + \omega_0^2\,x(t) = a(t)$$

is $x(t) = x_0(t) + \int_0^t dt'\,g(t - t')\,a(t')$, where $x_0(t)$ and $g(t)$ satisfy the homogeneous
differential equation and have the initial values $x_0(0) = x(0)$, $\dot x_0(0) = \dot x(0)$, and
$g(0) = 0$, $\dot g(0) = 1$. We are only interested in the first derivative $\dot x$, for which, using
$g(0) = 0$, we find the expression

$$\dot x(t) = \dot x_0(t) + \int_0^t dt'\,\dot g(t - t')\,a(t')\,.$$

The average force is also absent ($\omega_0 = 0$), and $2\gamma = 1/\tau$. Now, we have the simple
differential equation $\ddot g + \dot g/\tau = 0$ with $\dot g(0) = 1$, which leads to $\dot g(t) = \exp(-t/\tau)$.
Therefore, the solution of the Langevin equation for $t \ge 0$ reads

$$\mathbf v(t) = \mathbf v(0)\,\exp\frac{-t}{\tau} + \int_0^t dt'\,\exp\frac{-(t - t')}{\tau}\;\mathbf a(t')\,,$$

and from $\langle\mathbf v(0)\rangle = 0$, together with $\langle\mathbf a\rangle = 0$, it follows that $\langle\mathbf v(t)\rangle = 0$. After many
collisions, the initial velocity $\mathbf v(0)$ is thus "forgotten", and likewise the acceleration,
the longer back it lies. For $\tau \to \infty$, nothing is forgotten, but then the diffusion constant
from the last section would be much too large.


6.2.8 Generalized Langevin Equation and the 
Fluctuation—Dissipation Theorem 


So far we have assumed that the collisions are so fast that we could have taken
the correlation to be $\langle\mathbf a(t)\cdot\mathbf a(t')\rangle \propto \delta(t - t')$. We now drop this approximation,
assuming that the collisions last for a while. We set

$$\langle\mathbf a(t)\cdot\mathbf a(t')\rangle = \langle v^2\rangle\,\gamma(|t - t'|)\,,$$

because for an equilibrium distribution, only the time difference $|t - t'|$ may be of
importance, and we leave open the precise form of $\gamma$, although it will surely be
monotonically decreasing towards zero. It is convenient to factor out the fixed factor
$\langle v^2\rangle$.

In fact, we only need to modify the solution of the Langevin equation considered
above, viz.,

$$\mathbf v(t) = \mathbf v(0)\,\chi(t) + \int_0^t dt'\,\chi(t - t')\,\mathbf a(t')\,,$$

insofar as the linear response function $\chi$ to the perturbation $\mathbf a$ is no longer equal to the
old function $\dot g(t) = e^{-t/\tau}$. In particular, it is determined by $\langle\mathbf a(t)\cdot\mathbf a(t')\rangle$. Therefore,
we have to generalize the Langevin equation. Note that the linear response function
$\chi$ is sometimes called the generalized susceptibility.




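As before, we assume $\langle\mathbf a\rangle = 0$ and for the equilibrium distribution, i.e., with
$\langle\mathbf v(0)\cdot\mathbf v(0)\rangle = \langle v^2\rangle$ and $\langle\mathbf v(0)\cdot\mathbf a(t)\rangle = 0$, we obtain

$$\frac{\langle\mathbf v(t)\cdot\mathbf v(t')\rangle}{\langle v^2\rangle}
 = \chi(t)\,\chi(t') + \int_0^t dt''\int_0^{t'} dt'''\;\chi(t - t'')\,\chi(t' - t''')\,\gamma(|t'' - t'''|)\,.$$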


This expression has to be a function of $|t - t'|$. But how does $\gamma$ depend on $\chi$?
This may be answered by doing a Laplace transform. Instead of $\mathcal L\{\gamma\}$ as in
Sect. 2.3.8, we now write $\tilde\gamma$ for the Laplace transform of $\gamma$:

$$\tilde\gamma(s) = \int_0^\infty dt\;e^{-st}\,\gamma(t)\,.$$

Because $\gamma$ depends only on $|t - t'|$, we now consider the double Laplace transform

$$\tilde\gamma(s, s') = \int_0^\infty dt\int_0^\infty dt'\;e^{-st - s't'}\,\gamma(|t - t'|)\,,$$

and relate it to the single Laplace transform of $\gamma$. In particular, using
$st + s't' = (s + s')\,t + s'\,(t' - t)$ and $t'' = t' - t$, it follows that

$$\tilde\gamma(s, s') = \int_0^\infty dt\;e^{-(s+s')\,t}\int_{-t}^\infty dt''\;e^{-s't''}\,\gamma(|t''|)\,.$$

We split the last integral into two, one from 0 to $\infty$ and one from $-t$ to 0, and then
set $t' = -t''$:

$$\tilde\gamma(s, s') = \frac{\tilde\gamma(s')}{s + s'} + \int_0^\infty dt\;e^{-(s+s')\,t}\int_0^t dt'\;e^{s't'}\,\gamma(t')\,.$$

Since $\exp\{-(s + s')\,t\}$ is the derivative of $-\exp\{-(s + s')\,t\}/(s + s')$ with respect
to t, we may integrate by parts:

$$(s + s')\,\tilde\gamma(s, s') - \tilde\gamma(s')
 = \left[-e^{-(s+s')\,t}\int_0^t dt'\;e^{s't'}\,\gamma(t')\right]_{t=0}^{\infty}
 + \int_0^\infty dt\;e^{-(s+s')\,t}\,e^{s't}\,\gamma(t)\,.$$

Clearly, the "boundary values" do not contribute: the factor $\exp\{-(s + s')\,t\}$ kills
the integral for $t \to \infty$, and for $t = 0$ the integral does not contribute. The remaining
integral is $\tilde\gamma(s)$. Since all the functions $\gamma$ depend only on $|t - t'|$, we have the
"noteworthy property"

$$\tilde\gamma(s, s') = \frac{\tilde\gamma(s) + \tilde\gamma(s')}{s + s'}\,.$$
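
The "noteworthy property" can be tested numerically for a simple example. The sketch below assumes the kernel $\gamma(t) = e^{-\mu t}$ (an illustrative choice with known transform $1/(s + \mu)$) and evaluates the double transform by a truncated midpoint rule, so the agreement is only approximate.

```python
import math

def single_transform(s, mu=1.0):
    """Laplace transform of the hypothetical kernel gamma(t) = exp(-mu t): 1/(s + mu)."""
    return 1.0 / (s + mu)

def double_transform(s, sp, mu=1.0, t_max=15.0, n=500):
    """Midpoint-rule estimate of the double Laplace transform of gamma(|t - t'|)."""
    h = t_max / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        for j in range(n):
            tp = (j + 0.5) * h
            total += math.exp(-s * t - sp * tp - mu * abs(t - tp))
    return total * h * h

s, sp = 1.0, 2.0
lhs = double_transform(s, sp)
rhs = (single_transform(s) + single_transform(sp)) / (s + sp)
```

For these assumed values `rhs` is (1/2 + 1/3)/3, and `lhs` agrees with it to within the quadrature error.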


The double Laplace transform of $\langle\mathbf v(t)\cdot\mathbf v(t')\rangle$ reads accordingly, because for this,
too, only $|t - t'|$ is of importance. It contains the expression

$$L = \int_0^\infty dt\int_0^\infty dt'\;e^{-st - s't'}\int_0^t dt''\int_0^{t'} dt'''\;
 \chi(t - t'')\,\chi(t' - t''')\,\gamma(|t'' - t'''|)\,.$$

If we interchange the order of integration here, i.e., swapping t with $t''$ and $t'$ with
$t'''$, then $t''$ is integrated from 0 to $\infty$ and t from $t''$ to $\infty$, etc. If we then replace
$t - t'' \to t$ and $t' - t''' \to t'$, all four integrals have the limits 0 and $\infty$ and are easily
reformulated:

$$L = \int_0^\infty dt''\int_0^\infty dt'''\int_0^\infty dt\int_0^\infty dt'\;
 e^{-s\,(t + t'') - s'\,(t' + t''')}\;\chi(t)\,\chi(t')\,\gamma(|t'' - t'''|)
 = \tilde\chi(s)\,\tilde\chi(s')\,\tilde\gamma(s, s')\,.$$


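The double Laplace transform of $\langle\mathbf v(t)\cdot\mathbf v(t')\rangle/\langle v^2\rangle$ is thus equal to

$$\tilde\chi(s)\,\tilde\chi(s')\,\{1 + \tilde\gamma(s, s')\}
 = \frac{\tilde\chi(s)\,\tilde\chi(s')\,\{s + \tilde\gamma(s)\} + \tilde\chi(s)\,\tilde\chi(s')\,\{s' + \tilde\gamma(s')\}}{s + s'}\,.$$

This has to apply to a function which depends only on $|t - t'|$ and therefore has
the above-mentioned "noteworthy property". Consequently, $\tilde\chi(s)\,\{s + \tilde\gamma(s)\}$ cannot
depend on s at all, and so has to be a constant. Its value is determined by the requirement
$\chi(0) = 1$, with $\mathbf v(t)$ equal to $\mathbf v(0)$ for $t = 0$, and is in fact independent of $\gamma$. If
we use this for $\tilde\chi(s)$ in the limit $s \to \infty$ from $\tilde\chi \approx \chi(0)/s$, we arrive at the desired
relation

$$\tilde\chi(s) = \frac{1}{s + \tilde\gamma(s)}\,,$$

and hence also obtain the correlation function of the velocities, viz.,

$$\langle\mathbf v(t)\cdot\mathbf v(t')\rangle = \langle v^2\rangle\,\chi(|t - t'|)\,.$$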


The auto-correlation functions of the acceleration and velocity are thus related to each
other uniquely, and so also the fluctuations are related to the diffusion. This important
discovery is called the fluctuation–dissipation theorem. Instead of the pair of
notions reversible–irreversible (with respect to time), we take the pair
conservative–dissipative with regard to the energy.

For a correlation function

$$\gamma(t) = \Gamma^2\exp(-2\mu t)\,,$$

with $\tilde\gamma(s) = \Gamma^2/(s + 2\mu)$, the fluctuation–dissipation theorem leads to the function
$\tilde\chi(s) = (s + 2\mu)/\{(s + \mu)^2 - (\mu^2 - \Gamma^2)\}$. Since we normally set $\gamma(t) \propto \delta(t)$,
we should expect $\mu \gg \Gamma$. Using this and the abbreviation $\nu = \sqrt{\mu^2 - \Gamma^2} < \mu$, we
obtain the correlation function

$$\chi(t) = \exp(-\mu t)\left[\cosh(\nu t) + \frac{\mu}{\nu}\,\sinh(\nu t)\right].$$

For $t \gg \nu^{-1}$, it takes the form $\frac{\mu + \nu}{2\nu}\,\exp\{-(\mu - \nu)\,t\}$.
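
As a consistency check of this example, one can transform $\chi(t)$ numerically and compare with $1/(s + \tilde\gamma(s))$. The sketch below is illustrative: the values of `mu`, `Gamma`, and `s`, the helper `laplace`, and its truncation at `t_max` are assumptions.

```python
import math

def chi(t, mu, Gamma):
    """chi(t) = exp(-mu t) [cosh(nu t) + (mu/nu) sinh(nu t)], nu = sqrt(mu^2 - Gamma^2)."""
    nu = math.sqrt(mu * mu - Gamma * Gamma)
    return math.exp(-mu * t) * (math.cosh(nu * t) + mu / nu * math.sinh(nu * t))

def laplace(f, s, t_max=60.0, n=60_000):
    """Midpoint-rule Laplace transform of f, truncated at t_max."""
    h = t_max / n
    return h * sum(math.exp(-s * (i + 0.5) * h) * f((i + 0.5) * h) for i in range(n))

mu, Gamma, s = 2.0, 1.0, 1.0                      # hypothetical values with mu > Gamma
gamma_tilde = Gamma**2 / (s + 2.0 * mu)           # transform of Gamma^2 exp(-2 mu t)
chi_tilde_theorem = 1.0 / (s + gamma_tilde)       # fluctuation-dissipation relation
chi_tilde_numeric = laplace(lambda t: chi(t, mu, Gamma), s)
```

With these values `chi_tilde_theorem` is 5/6, and the numerical transform reproduces it to within the quadrature error.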

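The connection between $\chi$ and $\gamma$ is also useful for the derivative of $\mathbf v$ with respect to
t. The starting equation leads to $\tilde{\mathbf v} = \mathbf v(0)\,\tilde\chi + \tilde\chi\,\tilde{\mathbf a}$ with $\tilde\chi = 1/(s + \tilde\gamma)$, and hence to
the equation $s\,\tilde{\mathbf v} - \mathbf v(0) = \tilde{\mathbf a} - \tilde\gamma\,\tilde{\mathbf v}$. This expression is equal to the Laplace transform
of $\dot{\mathbf v}$. Therefore, we infer the generalized Langevin equation, for which the history
of the object is important:

$$\frac{d\mathbf v}{dt} = \mathbf a(t) - \int_0^t dt'\,\gamma(t - t')\,\mathbf v(t')\,,$$

if $\langle\mathbf a\rangle = 0$ and $\langle\mathbf a(t)\cdot\mathbf a(t')\rangle = \langle v^2\rangle\,\gamma(|t - t'|)$.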

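In the last section, we found $\chi(t) \approx \exp(-t/\tau)$ for $t \ge 0$, which yields
$\tilde\chi(s) \approx 1/(s + \tau^{-1})$. According to the fluctuation–dissipation theorem, $\tilde\gamma(s) \approx \tau^{-1}$ was
obtained, i.e., $\gamma(t) \approx 2\tau^{-1}\,\delta(t)$. This also implies

$$\int_0^\infty dt\;\langle\mathbf a(0)\cdot\mathbf a(t)\rangle \approx \frac{\langle v^2\rangle}{\tau}\,,$$

which, according to p. 536, is equal to $nD/\tau^2$. With $\langle v^2\rangle = nkT/m$ and $\alpha = m/\tau$,
we also have

$$\int_0^\infty dt\;\langle\mathbf F'(0)\cdot\mathbf F'(t)\rangle \approx nkT\,\alpha\,,$$

where $\mathbf F' = m\mathbf a$ is again the statistically fluctuating force.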

Even if we avoid the approximation of the last section, viz., $\gamma(t) \propto \delta(t)$, we may
nevertheless generally rely on $\gamma(t)$ decreasing almost to zero with increasing t. Then
it seems worthwhile considering a Taylor series expansion of $\mathbf v(t')$ about $t' \approx t$ in
the integrand of the generalized Langevin equation. With $t'$ instead of $t - t'$, this
leads to

$$\frac{d\mathbf v}{dt} = \mathbf a - \mathbf v(t)\int_0^t dt'\,\gamma(t') + \frac{d\mathbf v}{dt}\int_0^t dt'\,\gamma(t')\,t' \mp \cdots\,.$$

This takes the form of the usual Langevin equation if the first integral does not
depend upon t at all (and may be set equal to $\tau^{-1}$) and the remaining integrals do
not contribute. These requirements are satisfied if only the average changes in $\mathbf v$ are
important, averaged over the collision time, so that $\gamma$ has already decreased to its
final value.


6.2.9 Fokker—Planck Equation 


We now consider the distribution function $\rho(t, \mathbf v)$ for the velocity. We expect to obtain
a diffusion equation $\partial\rho/\partial t = D_v\,\Delta_{\mathbf v}\rho$ with $D_v > 0$. The Fokker–Planck equation [7]
also contains a drift term, since it reads

$$\frac{\partial\rho}{\partial t} = \frac{\nabla_{\mathbf v}\cdot(\rho\,\mathbf v)}{\tau} + D_v\,\Delta_{\mathbf v}\rho\,,
 \quad\text{with}\quad D_v = \frac{D}{\tau^2} = \frac{kT}{m\tau} > 0\,.$$

To derive this, we proceed in two steps. To begin with, we consider the Kramers–Moyal
expansion (in one dimension):

$$\frac{\partial\rho}{\partial t} = \sum_{k=1}^{\infty}\left(-\frac{\partial}{\partial v}\right)^{k} D^{(k)}(t, v)\;\rho(t, v)\,.$$

We then justify the claim that it is mainly the first two terms that contribute. Here the
general Fokker–Planck equation assumes neither that the drift coefficient is $D^{(1)} \propto v$,
nor that the diffusion coefficient $D^{(2)}$ has to be constant; it may even also depend
upon t, not only on v. However, $D^{(2)} \ge 0$ has to hold.

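If, in the short time $\Delta t$, the velocity changes by w with the probability density
$P(t, v \leftarrow t - \Delta t, v - w)$, then

$$\rho(t, v) = \int dw\;P(t, v \leftarrow t - \Delta t, v - w)\;\rho(t - \Delta t, v - w)\,.$$

If we restrict ourselves for the time being only to motion along a straight line, then
a Taylor expansion about $w = 0$ delivers

$$P(t, v \leftarrow t - \Delta t, v - w)\;\rho(t - \Delta t, v - w)
 = \sum_{k=0}^{\infty}\frac{(-w)^k}{k!}\left(\frac{\partial}{\partial v}\right)^{k}
 P(t, v + w \leftarrow t - \Delta t, v)\;\rho(t - \Delta t, v)\,.$$

Therefore, we introduce the moments

$$\langle w^k\rangle = \int dw\;P(t, v + w \leftarrow t - \Delta t, v)\;w^k\,.$$

They depend upon v, t, and $\Delta t$. With $P(t, v \leftarrow t, v - w) = \delta(w)$, all moments with
$k > 0$ have to vanish for $\Delta t = 0$. In contrast, $\langle w^0\rangle$ is always equal to 1. For the
determination of $\partial\rho/\partial t$, we may restrict ourselves to the linear terms in $\Delta t$ (the term
$k = 0$ does not contribute), and using

$$\frac{\langle w^k\rangle}{k!} = D^{(k)}(t, v)\,\Delta t + \cdots\,, \quad\text{with}\quad k \in \{1, 2, 3, \ldots\}\,,$$

we arrive at the above-mentioned Kramers–Moyal expansion.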




Here it is clear that none of the coefficients $D^{(k)}$ with even $k$ are negative, because
the probability density $P$ has this property.
To derive the Fokker–Planck equation, we now have to consider the expansion
coefficients

$$D^{(k)}(t, v) = \frac{1}{k!}\,\frac{\partial\langle w^k\rangle}{\partial\Delta t}\bigg|_{\Delta t = 0} .$$


They can be determined from the Langevin equation $\dot v = a - v/\tau$ with $\langle a\rangle = 0$. If
in the time $\Delta t$, the collision acceleration averages out, and on the other hand $\Delta t$
nevertheless remains so small that we may restrict ourselves to the linear term, we
may conclude that $\langle w\rangle = -v\,\Delta t/\tau$, while for short times, only the auto-correlation
of the collision accelerations contributes to $\langle w^2\rangle$:

$$\langle w^2\rangle \approx \int\!\!\int_0^{\Delta t} dt'\,dt''\,\langle a(t')\,a(t'')\rangle \approx \Delta t\int_{-\infty}^{\infty} dt\,\langle a(0)\,a(t)\rangle = 2D_v\,\Delta t .$$


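Here the expansion coefficients $D^{(k)}$ vanish for $k > 2$ if for even $k$, we start from

$$\langle a(t_1)\cdots a(t_k)\rangle = \sum_{\text{all pairs}}\langle a(t_i)\,a(t_j)\rangle\cdots\langle a(t_l)\,a(t_m)\rangle ,$$

and a similar sum for $k + 1$, where each term also contains a further factor $\langle a\rangle$. This
ensures that $\langle w^{2\kappa+1}\rangle$ vanishes for $\kappa > 0$. In addition, it then follows that $\langle w^{2\kappa}\rangle \propto (\Delta t)^{\kappa}$,
so only $D^{(1)}$ and $D^{(2)}$ actually remain different from zero.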

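With this we can now derive the Fokker–Planck equation (in the three-dimensional
space, correlations between the different directions are not expected):

$$\frac{\partial\rho}{\partial t} = \frac{\nabla_v\cdot(\rho\,\mathbf v)}{\tau} + D_v\,\Delta_v\rho = (3 + \mathbf v\cdot\nabla_v + \tau D_v\,\Delta_v)\,\frac{\rho}{\tau} .$$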
T 


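Reformulation can help us find solutions. The middle term $\mathbf v\cdot\nabla_v$ vanishes if we introduce
the variable $\mathbf u = \mathbf v \exp(t/\tau)$ instead of $\mathbf v$ (p. 43 is useful for such reformulations):

$$\Big(\frac{\partial\rho}{\partial t}\Big)_{\mathbf v} = \Big(\frac{\partial\rho}{\partial t}\Big)_{\mathbf u} + \Big(\frac{\partial\mathbf u}{\partial t}\Big)_{\mathbf v}\cdot\nabla_u\rho = \Big(\frac{\partial\rho}{\partial t}\Big)_{\mathbf u} + \frac{\mathbf v\cdot\nabla_v\rho}{\tau} .$$

Therefore, with $\rho$ now a function of $t$ and $\mathbf u$, and with $\Delta_v = \exp(2t/\tau)\,\Delta_u$, we arrive
at

$$\frac{\partial\rho}{\partial t} = \Big(3 + \tau D_v \exp\frac{2t}{\tau}\,\Delta_u\Big)\frac{\rho}{\tau} .$$

The first term on the right-hand side disappears if we consider the differential equation
for $f = \rho\exp(-3t/\tau)$:

$$\frac{\partial f}{\partial t} = D_v \exp\frac{2t}{\tau}\,\Delta_u f .$$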




Fig. 6.10 Fokker–Planck equation. Diffusion equation with a drift term (see Fig. 6.8 for the
situation without this term). Also represented are initially sharp solutions for three times
(red, blue, and green curves), the last being $\tau$. At the beginning, $\langle v\rangle = -3\sqrt{D_v\tau}$ holds.
The stationary final distribution is the dashed curve


Finally, we also set $t' = \tfrac12\tau\{\exp(2t/\tau) - 1\}$, and with $dt' = \exp(2t/\tau)\,dt$, we obtain
the diffusion equation (in the velocity space)

$$\frac{\partial f}{\partial t'} = D_v\,\Delta_u f .$$

According to p. 536, its solution is $f = (4\pi D_v t')^{-3/2}\exp\{-(\mathbf u - \mathbf u_0)^2/4D_v t'\}$. Using
this, and if the initial velocity $\mathbf v_0$ is given as sharp, the desired solution of the
Fokker–Planck equation reads (see Fig. 6.10)

$$\rho(t, \mathbf v) = \frac{1}{\sqrt{2\pi\tau D_v\{1 - \exp(-2t/\tau)\}}^{\,3}}\,\exp\frac{-\{\mathbf v - \mathbf v_0\exp(-t/\tau)\}^2}{2\tau D_v\{1 - \exp(-2t/\tau)\}} .$$


Consequently, the mean value $\langle\mathbf v\rangle = \mathbf v_0\exp(-t/\tau)$ decreases down to the equilibrium
value 0. But the drift term also limits the squared fluctuation, viz.,

$$(\Delta v)^2 = 3\tau D_v\Big(1 - \exp\frac{-2t}{\tau}\Big) ,$$

which then approaches the equilibrium value $3\tau D_v$ twice as fast (with half the relaxation
time $\tfrac12\tau$). Otherwise, with $\tau$ very large compared to the observation time
$t$, it would have increased permanently with $(\Delta v)^2 \approx 6D_v t$. This time-dependent
squared fluctuation helps us even for the distribution function:

$$\rho(t, \mathbf v) = \frac{1}{\big(\sqrt{2\pi/3}\;\Delta v(t)\big)^{3}}\,\exp\Big\{-\frac32\Big(\frac{\mathbf v - \mathbf v_0\exp(-t/\tau)}{\Delta v(t)}\Big)^{2}\Big\} .$$


For $t \gg \tau$ with $(\Delta v)^2 \approx 3\tau D_v = 3kT/m$, it goes over into the equilibrium
distribution

$$\rho(\mathbf v) = \frac{\exp(-\tfrac12 m v^2/kT)}{\sqrt{2\pi kT/m}^{\,3}} .$$

We shall derive this Maxwell distribution again in a different way in Sect. 6.3.1.
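The time-dependent Gaussian can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the one-dimensional analogue of the solution (variance $\tau D_v\{1-\exp(-2t/\tau)\}$ per component) and tests by central differences that it satisfies $\partial\rho/\partial t = \partial(\rho v)/\partial v\,/\tau + D_v\,\partial^2\rho/\partial v^2$. The function names, the parameter values, and the step size `h` are illustrative choices.

```python
import math

def rho_1d(t, v, v0=1.0, tau=1.0, d_v=0.5):
    """One-dimensional analogue of the Fokker-Planck solution for a sharp start at v0."""
    var = tau * d_v * (1.0 - math.exp(-2.0 * t / tau))   # variance of one velocity component
    mean = v0 * math.exp(-t / tau)                        # decaying mean velocity
    return math.exp(-(v - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fp_residual(t, v, v0=1.0, tau=1.0, d_v=0.5, h=1e-4):
    """d(rho)/dt - [d(rho v)/dv / tau + D_v d^2(rho)/dv^2], all via central differences."""
    def r(tt, vv):
        return rho_1d(tt, vv, v0, tau, d_v)
    d_t = (r(t + h, v) - r(t - h, v)) / (2.0 * h)
    drift = ((v + h) * r(t, v + h) - (v - h) * r(t, v - h)) / (2.0 * h) / tau
    diffusion = d_v * (r(t, v + h) - 2.0 * r(t, v) + r(t, v - h)) / h ** 2
    return d_t - drift - diffusion

# Residuals at a few (t, v) points; they should vanish up to discretization error.
residuals = [abs(fp_residual(t, v)) for t in (0.3, 1.0, 3.0) for v in (-1.0, 0.2, 1.5)]
print(max(residuals))
```

The residual is small compared with the individual terms, which are of order one at these points. For the three-dimensional case the same check applies component by component, since the solution factorizes.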


6.2.10 Summary: Entropy Law 


Our aim here was to justify the thermodynamically important entropy law. The 
entropy of a closed system can only increase as time goes by, never decrease. This 
holds for macroscopic systems with many degrees of freedom if we describe them 
with only a small number of variables, and in any case, we could by no means account 
for all of them. If the entropy of a closed system increases, it changes irreversibly, 
even though all the basic equations of mechanics and electromagnetism remain the 
same under time reversal. The entropy law follows from the rate equation. A partic- 
ularly impressive example of a rate equation is supplied by the Boltzmann equation. 
It holds for a gas of colliding molecules, as long as their probability distributions are 
uncorrelated (the assumption of molecular chaos). 

The increase in the entropy in closed systems does not contradict the observation 
of biological systems, which always become more intricate, and hence less probable. 
They are not closed systems. 


6.3 Equilibrium Distribution 


6.3.1 Maxwell Distribution 


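The collision integral in the Boltzmann equation vanishes for collisions of identical
molecules, if (see p. 527 for detailed equilibrium)

$$\rho(t, \mathbf r, \mathbf v_1)\,\rho(t, \mathbf r, \mathbf v_2) = \rho(t, \mathbf r, \mathbf v_1')\,\rho(t, \mathbf r, \mathbf v_2') .$$

Energy and momentum conservation also impose the constraints

$$v_1^2 + v_2^2 = v_1'^2 + v_2'^2 , \qquad \mathbf v_1 + \mathbf v_2 = \mathbf v_1' + \mathbf v_2' .$$

Consequently, for elastic collisions, $(\mathbf v_1 - \mathbf v_0)^2 + (\mathbf v_2 - \mathbf v_0)^2$ is conserved for
arbitrary $\mathbf v_0$. The first equation may be brought into this form:

$$\ln\rho(t, \mathbf r, \mathbf v_1) + \ln\rho(t, \mathbf r, \mathbf v_2) = \ln\rho(t, \mathbf r, \mathbf v_1') + \ln\rho(t, \mathbf r, \mathbf v_2') .$$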




Note that the sum of two one-particle quantities is conserved. Since both $\mathbf v_1$ and $\mathbf v_2$
may be chosen quite arbitrarily, the general solution is

$$\ln\rho = -A\,(\mathbf v - \mathbf v_0)^2 + \ln C ,$$

and this yields the local Maxwell distribution

$$\rho(t, \mathbf r, \mathbf v) = C(t, \mathbf r)\exp\{-A(t, \mathbf r)\,(\mathbf v - \mathbf v_0(t, \mathbf r))^2\} ,$$

with initially arbitrary functions $C(t, \mathbf r)$, $A(t, \mathbf r)$, and $\mathbf v_0(t, \mathbf r)$, provided that it is
normalized correctly, i.e., $\int d^3r\,d^3v\,\rho(t, \mathbf r, \mathbf v) = 1$.

Let us take here the special case in which the probability density depends only
on $\mathbf v$. We then have $\rho(\mathbf v) = C\exp\{-A\,(\mathbf v - \mathbf v_0)^2\}$ with $\int d^3v\,\rho(\mathbf v) = 1$. This Gauss
distribution is symmetric with respect to $\mathbf v_0$. Therefore,

$$\langle\mathbf v\rangle = \mathbf v_0 .$$

Consequently, $\mathbf v_0$ is the average velocity of a molecule. The normalization requires
$C = (A/\pi)^{3/2}$, and the parameter $A$ is related to the squared fluctuation in the velocity
by $(\Delta v)^2 = \tfrac32 A^{-1}$. We thus obtain

$$\rho(\mathbf v) = \frac{1}{\big(\sqrt{2\pi/3}\;\Delta v\big)^{3}}\,\exp\Big\{-\frac32\,\frac{(\mathbf v - \mathbf v_0)^2}{(\Delta v)^2}\Big\} .$$

This is the famous Maxwell distribution, if we take $(\Delta v)^2$ as a measure of the
disorderliness of the motion and relate the associated kinetic energy to the temperature
according to

$$\tfrac12 m\,(\Delta v)^2 = \tfrac32 kT ,$$

by setting $(\Delta v)^2 = 3kT/m$, as discussed on p. 534.

If we restrict ourselves to gases which are on the average at rest (something that
can always be realized with suitable coordinates), then $\mathbf v_0 = 0$, and the distribution
is isotropic. Only the modulus of $\mathbf v$ is important in this case. Using $d^3v = v^2\,dv\,d\Omega_v$,
if we require $\int_0^\infty dv\,\rho(v) = 1$, then

$$\rho(v) = \frac{4\pi v^2\exp(-\tfrac12 m v^2/kT)}{\sqrt{2\pi kT/m}^{\,3}} .$$

Clearly, the maximum of $\rho(v)$ is at $\hat v = \sqrt{2kT/m}$, and thus $\tfrac12 m\hat v^2 = kT$. The mean
value of the modulus of $\mathbf v$ lies somewhat higher, namely at $\langle v\rangle = (2/\sqrt\pi)\,\hat v$.

But instead of $\rho(v)$, we often consider $\rho(E)$, the distribution with respect to the
kinetic energy $E$, and use $dE = mv\,dv$:

$$\rho(E) = \frac{2\sqrt{E/kT}\,\exp(-E/kT)}{\sqrt\pi\,kT} .$$

The maximum of this distribution lies at $\hat E = \tfrac12 kT$, and its mean value is $\langle E\rangle = \tfrac32 kT = 3\hat E$.
The uncertainty is $\Delta E = \sqrt{3/2}\,kT$ (see Fig. 6.11).

Fig. 6.11 Maxwell distributions. $\rho(v)$ (left, plotted against $v/\hat v$), $\rho(E)$ (right, plotted against $E/\hat E$)
in suitable temperature-independent units: $\hat v = \sqrt{2kT/m}$ and $\hat E = \tfrac12 kT$
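The quoted characteristic values of the Maxwell distribution can be confirmed by direct numerical integration. The sketch below is only an illustration under assumed units $kT = m = 1$; the helper `simpson` and the integration cutoffs are choices made here, not part of the text.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

kT, m = 1.0, 1.0   # assumed units

def rho_speed(v):
    """Distribution of the modulus of the velocity."""
    return 4.0 * math.pi * v ** 2 * math.exp(-0.5 * m * v ** 2 / kT) / (2.0 * math.pi * kT / m) ** 1.5

def rho_energy(e):
    """Distribution of the kinetic energy."""
    return 2.0 * math.sqrt(e / kT) * math.exp(-e / kT) / (math.sqrt(math.pi) * kT)

v_hat = math.sqrt(2.0 * kT / m)                      # position of the maximum of rho(v)
norm_v = simpson(rho_speed, 0.0, 12.0)
mean_v = simpson(lambda v: v * rho_speed(v), 0.0, 12.0)
norm_e = simpson(rho_energy, 0.0, 60.0, n=200000)    # finer grid: sqrt(E) is not smooth at E = 0
mean_e = simpson(lambda e: e * rho_energy(e), 0.0, 60.0, n=200000)
var_e = simpson(lambda e: e ** 2 * rho_energy(e), 0.0, 60.0, n=200000) - mean_e ** 2

print(norm_v, mean_v / v_hat, norm_e, mean_e / kT, math.sqrt(var_e) / kT)
```

The printed ratios should be close to $1$, $2/\sqrt\pi$, $1$, $3/2$, and $\sqrt{3/2}$, respectively.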


6.3.2 Thermal Equilibrium 


The Maxwell distribution is an equilibrium distribution, because it was expressly 
assumed that collisions do not alter anything. Therefore, in particular, the entropy is 
also conserved, despite the collisions. 

Generally, thermal (thermodynamic or also statistical) equilibrium exists if the 
entropy does not change with time by itself. Such an equilibrium always exists if we 
consider closed systems with an entropy as high as possible. Of course, all parameters 
which characterize our statistical ensemble must then be given as fixed. 

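In the Schrödinger picture, a sufficient equilibrium condition is

$$\frac{\partial\rho}{\partial t} = 0 \quad\Longrightarrow\quad \text{equilibrium},$$

since then neither $\rho$ nor the mean values $\{\langle A_i\rangle\}$ depend upon time, including the
entropy. With the Liouville equation, the constraint $\partial\rho/\partial t = 0$ may also be replaced
by the requirement

$$\sum_k\Big(\frac{\partial H}{\partial x^k}\frac{\partial\rho}{\partial p_k} - \frac{\partial H}{\partial p_k}\frac{\partial\rho}{\partial x^k}\Big) = 0 .$$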



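This is satisfied if, instead of the distribution function $\rho$ with its $6N$ variables, we
take the distribution function $\rho(H)$ with the energy as its only variable. Then,

$$\frac{\partial\rho}{\partial x^k} = \frac{d\rho}{dH}\,\frac{\partial H}{\partial x^k} , \qquad \frac{\partial\rho}{\partial p_k} = \frac{d\rho}{dH}\,\frac{\partial H}{\partial p_k} ,$$

and the Poisson bracket $[H, \rho(H)]$ always vanishes.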

In quantum theory, stationary states are eigenstates of the Hamilton operator: their
density operator $\rho$ commutes with $H$. Conversely, from $[H, \rho] = 0$, in the energy
representation, it follows that $(E_z - E_{z'})\,\langle z|\rho|z'\rangle = 0$. If there is no degeneracy, i.e.,
$E_z \ne E_{z'}$ for $z \ne z'$, then the density operator of an equilibrium state is diagonal:
$\langle z|\rho|z'\rangle = \rho(E_z)\,\langle z|z'\rangle$, or $\rho = \sum_z |z\rangle\,\rho(E_z)\,\langle z|$. Here $\rho(E_z)$ is the probability of
the state $|z\rangle$ with energy $E_z$. (We divide possible degeneracies into two classes,
namely those which spring from special symmetries of the Hamilton operator, and
those which are merely accidental. We account for symmetries by further quantum
numbers, or simply multiply $\rho(E_z)$ by the number of degenerate states. However, we
shall disregard accidental degeneracies here. We assume that accidental degeneracies
occur so rarely that they have no statistical weight.)

The above-mentioned equilibrium condition $\partial\rho/\partial t = 0$ may also be replaced by
the sufficient constraint that $\rho$ depend only on the energy. (However, this is not necessary,
because according to the Liouville equation, for degenerate states there may
also be entropy-increasing exchanges without energy change.) In the following, we
shall determine several canonical distributions for different equilibrium conditions.
Here we must always make an assumption concerning the energy with reference to
the equilibrium conditions.


6.3.3 Micro-canonical Ensemble 


Closed systems belong to a micro-canonical ensemble if they have the same external
parameters, their energy lies in the interval between $E$ and $E + dE$, and they are in
equilibrium. Their entropy is then as high as possible, otherwise it would not be an
equilibrium. According to Problem 6.6 (Sect. 6.1.6), all $Z_{MC}$ permitted (accessible)
states have the same probability, the values resulting from the normalization of $\rho$:

$$\rho_{MC}(E_z) = \begin{cases} Z_{MC}^{-1} , & \text{for } E \le E_z \le E + dE , \\ 0 , & \text{otherwise} . \end{cases}$$

The constant $Z_{MC}$, which is the number of states in the considered energy regime, is
the partition function. Note that, since the letter $Z$ is the generally accepted notation
for the partition function, we count the states with $z$ and the upper boundary is called
$Z$. Here the partition functions are related to the various ensembles, which is why
we append the subscript "MC" for micro-canonical.




The energy values $E_z$ depend on the given problem. We shall take care of this
later. Here we are interested primarily in the question of the probabilities with which
the single energies occur in the ensemble, in order to make the entropy as high as
possible, since this determines the equilibrium.

The idea of requiring equal "a priori probabilities" is suggestive even without
considering the entropy. It is the only sensible assumption, as long as there are no
reasons to prefer certain states over others in the considered regime. For any other
distribution, there are irreversible transitions between the states until equilibrium
is reached, at which point the entropy is maximal. According to Sect. 6.1.6, this
highest entropy is $S = k\ln Z_{MC}$. It belongs to $Z_{MC}$ states with equal probabilities
$\rho_z = Z_{MC}^{-1}$.

It is often claimed that the entropy $S$ may be expressed in terms of the thermodynamic
probability $W$ in the form $S = k\ln W$, even though it is admitted that this
"probability" might be greater than one, which contradicts the notion of probability.
In contrast, there is a corresponding equation with the micro-canonical partition
function $Z_{MC}$ rather than the thermodynamic probability $W$. In some sense though,
this partition function may be connected to an occurrence, and relative occurrences
do lead to probabilities. In this context, we compare two micro-canonical ensembles:
the original one with the partition function $Z_{MC}$ and another, which is less restricted
and also contains other states. Then its partition function $Z_{MC}^{>}$ is greater than $Z_{MC}$.
According to the basic assumption of equal a priori probabilities, the probability of
a state of the original ensemble in this larger ensemble is given by $Z_{MC}/Z_{MC}^{>}$. Here
$Z_{MC}^{>}$ is in fact not uniquely fixed, but this freedom "only" relates to the zero of the
entropy: the denominator necessary for the normalization in fact shifts the origin of
the entropy, but what is important are usually only differences in entropy.

The relation $S = k\ln W$ is called Boltzmann's principle. From $W = \exp(S/k)$
and $\dot S \ge 0$, it follows that $\dot W \ge 0$, which tells us that the "disorder" in an isolated
system can only increase as time goes by.


6.3.4 Density of States in the Single-Particle Model 


For macroscopic bodies, the density of the energy eigenvalues $E_z$ increases approximately
exponentially with the energy, as we shall now show with a particularly
simple example.

We consider a system of very many distinguishable particles which all feel the
same average force, but no residual interaction, and thus have no correlations between the
particles. (As long as the residual interaction can be treated with perturbation theory, the
results barely change. The levels may move relative to each other, but this affects
neither the partition function nor the average level density.) According to quantum
theory, the one-particle potential fixes the one-particle energies and hence also the
number of states below the energy $E$, which for the $N$-particle system we shall denote
by $\Omega(E, N)$. Note that, on p. 525, we wrote $\Omega$ for $\Omega(E, 1)$. We now have


$$Z_{MC} = \Omega(E + dE, N) - \Omega(E, N) ,$$

and instead of summing over $z$, we may also integrate over the energy, if we take the
density of states $\partial\Omega/\partial E$ as weight factor: $\sum_z = \int dE\;\partial\Omega/\partial E$.

Since we have assumed only particles that are independent of each other, and
therefore neglect correlations, for this "number of states", we have

$$\Omega(E, N) \approx \Omega^N(E/N, 1) .$$


Here the approximation consists in ignoring the fact that not all particles have to have the same
energy, since only the total energy is given. But we shall soon see that for sufficiently
large $N$, $\Omega(E, N)$ depends so strongly on the energy that other energy separations
barely contribute to the density of states. The number of one-particle states does not
in fact depend particularly strongly on the energy, e.g., according to p. 525, for a gas
of interaction-free molecules, we find $\Omega \propto E^{3/2}$. But the huge power $N$ leads to a
very strong energy dependence of $\Omega(E, N)$ for the $N$-particle system. In particular,
if $\Omega(\tfrac12 E, \tfrac12 N) = a\,E^M$ holds with $M \gg 1$, then the product is

$$\Omega\big(\tfrac12(E + \varepsilon), \tfrac12 N\big)\;\Omega\big(\tfrac12(E - \varepsilon), \tfrac12 N\big) = a^2\,(E^2 - \varepsilon^2)^M .$$

Even for $\varepsilon/E = \sqrt{\alpha/M}$, this is smaller than $a^2E^{2M}$ by the factor $e^{-\alpha}$, e.g., with a
millimol and $\varepsilon/E = 10^{-9}$, whence $\alpha = \tfrac32 \times 6 \times 10^{20} \times 10^{-18} = 900$, by nearly 400
orders of magnitude. Therefore, only $\Omega(E/N, 1)$ is actually important. An example
is shown in Fig. 6.12.
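The size of this suppression is easy to reproduce. The following sketch evaluates $\log_{10}$ of the factor $(1 - \varepsilon^2/E^2)^M$ for the numbers quoted in the text; the function name is hypothetical, and $M$ is simply chosen so that $\varepsilon/E = \sqrt{\alpha/M}$ with $\alpha = 900$.

```python
import math

def log10_suppression(exponent_m, eps_over_e):
    """log10 of (1 - (eps/E)^2)^M, the relative weight of an uneven energy split."""
    # log1p keeps full precision although (eps/E)^2 is far below machine epsilon.
    return exponent_m * math.log1p(-eps_over_e ** 2) / math.log(10.0)

alpha = 900.0
eps_over_e = 1e-9
exponent_m = alpha / eps_over_e ** 2      # M fixed by eps/E = sqrt(alpha/M)

value = log10_suppression(exponent_m, eps_over_e)
print(value)                              # close to -alpha / ln(10)
```

The result is about $-391$, i.e., the "nearly 400 orders of magnitude" mentioned above.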


Fig. 6.12 The number $\Omega(E, N)$ of states up to the energy $E$ of an $N$-particle system decreases
rapidly if the energy is not distributed evenly over all particles. Here, one half has the energy
$E_{<} = \tfrac12(E - \varepsilon)$ and the other half the energy $E_{>} = \tfrac12(E + \varepsilon)$. We plot the ratio
$\Omega(E_{<}, \tfrac12 N)\,\Omega(E_{>}, \tfrac12 N)/\Omega(E, N)$ against $\varepsilon/E$ for $N = 1000$ (dashed curve) and for
$N = 2000$ (continuous curve)




Fig. 6.13 Probability distribution $\rho(E_z)$ of a micro-canonical ensemble of 100 particles in a cube
as a function of $E_z$. Here the density of states increases with $E_z$. The higher energies in the allowed
regime contribute more strongly than the lower ones


For an energy shift $E \to E + \delta E$, the function $\Omega(E, N)$ changes so much that a
Taylor series makes sense only for its logarithm:

$$\ln\Omega(E + \delta E, N) \approx \ln\Omega(E, N) + \frac{\partial\ln\Omega(E, N)}{\partial E}\,\delta E .$$

Here the factor in front of $\delta E$ is huge, namely $\tfrac32 N/E$ for $\Omega \propto (E^{3/2})^N$. Even for one
millimol and $\delta E/E = 10^{-9}$, $\ln\Omega$ increases by nearly a trillion, and the number of
states increases in this approximation exponentially with the energy $\delta E$ to

$$\Omega(E + \delta E, N) \approx \Omega(E, N)\,\exp\Big(\frac{\partial\ln\Omega(E, N)}{\partial E}\,\delta E\Big) .$$


This property of the partition function or of the density of states $\partial\Omega/\partial E$ leads us to a
new problem: for all mean values of the micro-canonical ensemble, the upper energy
regime is much more important than the lower one. Here, only the mean value of the
energy is accessible to us macroscopically, so we should give $\langle E\rangle$ and not start from
the micro-canonical ensemble (see Fig. 6.13).

Note that the density of states also increases with the particle number N and 
the volume V as strongly as with the energy E, because the above considerations 
may be transferred to all other extensive parameters. By an extensive parameter, 
we understand a macroscopic parameter which is proportional to the size of the 
system, like the particle number, the energy, and the volume. In contrast, intensive 
parameters keep their value under subdivision of the system, e.g., the temperature 
T and the pressure p. 


6.3.5 Mean Values and Entropy Maximum 


For all "canonical ensembles" except for the micro-canonical one, we always fix
average values: for the canonical ensemble, the energy $\langle E\rangle$, for the grand canonical
ensemble, also the particle number $\langle N\rangle$, and for the generalized grand canonical
ensemble also other mean values, such as the volume $\langle V\rangle$, which for the other
ensembles is given precisely, just as the particle number $N$ is given precisely for the
canonical ensemble.

We now search for the general distribution $\{\rho_z\}$ with the highest entropy which
is consistent with the constraints given by the mean values $\langle A_i\rangle$. Here we take only
mean values of extensive quantities, such that the error widths remain as negligible
as possible.




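An indispensable constraint is $\operatorname{tr}\rho = \langle 1\rangle = 1$. Therefore, we begin with $i = 0$ and
set $A_0 = 1$. For $n$ further constraints, $i$ runs up to $n$. With the Lagrangian parameters
$-k\lambda_i$ for the unknown $\rho_z$, we have the variation problem

$$\delta\Big(S - k\sum_{i=0}^{n}\lambda_i\,\langle A_i\rangle\Big) = 0 , \quad\text{or}\quad \sum_z\delta\rho_z\,\Big(\ln\rho_z + \sum_{i=0}^{n}\lambda_i A_{iz} + 1\Big) = 0 .$$

The extremum is obtained from $\ln\rho_z + \sum_{i=0}^{n}\lambda_i A_{iz} + 1 = 0$ and leads to

$$\rho_z = \exp\Big(-1 - \sum_{i=0}^{n}\lambda_i A_{iz}\Big) = \exp\Big(-\sum_{i=1}^{n}\lambda_i A_{iz}\Big)\Big/\exp(1 + \lambda_0) .$$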
i=0 i=l 


The Lagrangian parameter $\lambda_0$ follows from the norm $\operatorname{tr}\rho = 1$. If no further mean
values are given, then the highest entropy belongs to $\rho_z = 1/Z$ with $Z = \sum_z 1$, as
we know already from Sect. 6.1.6. Otherwise, taking the partition function

$$Z = \sum_z\exp\Big(-\sum_{i=1}^{n}\lambda_i A_{iz}\Big) ,$$

with $\sum_z\rho_z = 1$, we have the equation $\exp(1 + \lambda_0) = Z$. Hence, $\lambda_0 = \ln Z - 1$ and

$$\rho_z = \frac1Z\,\exp\Big(-\sum_{i=1}^{n}\lambda_i A_{iz}\Big) .$$

The remaining Lagrangian parameters $\lambda_i$ are related to the corresponding mean
values:

$$\langle A_i\rangle = \sum_z\rho_z A_{iz} = \frac1Z\sum_z A_{iz}\exp\Big(-\sum_{j=1}^{n}\lambda_j A_{jz}\Big) = -\frac1Z\,\frac{\partial Z}{\partial\lambda_i} = -\frac{\partial\ln Z}{\partial\lambda_i} .$$

The mean values $\langle A_i\rangle$ thus follow from derivatives of the partition function $Z$, so
we have to determine $Z(\lambda_1, \ldots, \lambda_n)$ such that, for all $i \in \{1, \ldots, n\}$, the equations
$\langle A_i\rangle = -\partial\ln Z/\partial\lambda_i$ are satisfied, where the remaining Lagrangian parameters $\lambda_j$
with $j \ne i$ are to be kept fixed.

We have thus found the constraints for the extremum of $S[\rho]$. It is a maximum,
because $-k\rho_z\,(\ln\rho_z + \sum_{i=0}^{n}\lambda_i A_{iz})$ differentiated twice with respect to $\rho_z$ is equal to
$-k/\rho_z < 0$. We shall investigate the physical meaning of the Lagrangian parameters
$\lambda_1, \ldots, \lambda_n$ in Sect. 6.3.8. These are adjustable parameters and lead us among other
things to the temperature and the pressure.
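A small numerical illustration may help here. The sketch below takes a hypothetical four-state system with a single constrained quantity $A$ (the values in `levels` and the parameter `lam` are arbitrary), builds $\rho_z = \exp(-\lambda A_z)/Z$, and checks two statements of this section: that $\langle A\rangle = -\partial\ln Z/\partial\lambda$, and that any small change of $\rho_z$ which keeps both the norm and $\langle A\rangle$ fixed lowers the entropy.

```python
import math

def partition_function(lam, levels):
    return sum(math.exp(-lam * a) for a in levels)

def probabilities(lam, levels):
    z = partition_function(lam, levels)
    return [math.exp(-lam * a) / z for a in levels]

def mean_value(rho, levels):
    return sum(p * a for p, a in zip(rho, levels))

def entropy_over_k(rho):
    return -sum(p * math.log(p) for p in rho)

levels = [0.0, 1.0, 2.0, 5.0]   # hypothetical values A_z of the constrained quantity
lam = 0.7                       # hypothetical Lagrangian parameter
rho = probabilities(lam, levels)

# <A> compared with -d(ln Z)/d(lambda), the derivative taken by a central difference.
h = 1e-5
dlnz = (math.log(partition_function(lam + h, levels))
        - math.log(partition_function(lam - h, levels))) / (2.0 * h)
print(mean_value(rho, levels), -dlnz)

# The direction d has sum(d) = 0 and sum(d * A_z) = 0, so it changes neither norm nor <A>.
d = [1.0, -2.0, 1.0, 0.0]
eps = 0.01
rho_plus = [p + eps * x for p, x in zip(rho, d)]
rho_minus = [p - eps * x for p, x in zip(rho, d)]
print(entropy_over_k(rho), entropy_over_k(rho_plus), entropy_over_k(rho_minus))
```

Both perturbed distributions have a lower entropy than the exponential one, as expected for a maximum.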

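Note that the partition function also yields the squared fluctuation of $\langle A_i\rangle$, because
from

$$\langle A_i^2\rangle = \frac1Z\,\frac{\partial^2 Z}{\partial\lambda_i^2} = \frac{\partial}{\partial\lambda_i}\Big(\frac1Z\,\frac{\partial Z}{\partial\lambda_i}\Big) + \Big(\frac1Z\,\frac{\partial Z}{\partial\lambda_i}\Big)^2 = -\frac{\partial\langle A_i\rangle}{\partial\lambda_i} + \langle A_i\rangle^2$$

we deduce

$$(\Delta A_i)^2 = -\frac{\partial\langle A_i\rangle}{\partial\lambda_i} .$$

Since the squared fluctuation is non-negative, the partial derivative must not be positive.
If it is zero, then there is no unique relation $\langle A_i\rangle \leftrightarrow \lambda_i$. Otherwise $\langle A_i\rangle$ is a
monotonically decreasing function of $\lambda_i$, and so $\lambda_i$ is a monotonically decreasing
function of $\langle A_i\rangle$. Clearly, also $(\Delta A_i)^2 = \partial^2\ln Z/\partial\lambda_i^2$ holds.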

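If the mixed derivatives $\partial^2\ln Z/(\partial\lambda_i\,\partial\lambda_j)$ are continuous, then the order of the
derivatives may be interchanged. Then we arrive at the equations

$$\frac{\partial\langle A_i\rangle}{\partial\lambda_j} = \frac{\partial\langle A_j\rangle}{\partial\lambda_i} .$$

These are Maxwell's integrability conditions (Maxwell relations), which will turn
out to be useful later on.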
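Both the fluctuation formula and the integrability conditions can be tested on a toy model. The following sketch assumes a hypothetical system of four states, each carrying two quantities $(A_{1z}, A_{2z})$, and compares finite-difference derivatives of the mean values; all numbers are arbitrary illustrations.

```python
import math

# Hypothetical states, each with values (A_1z, A_2z) of two constrained quantities.
states = [(0.0, 1.0), (1.0, 0.0), (1.0, 2.0), (3.0, 1.0)]

def moments(l1, l2):
    """Return <A_1>, <A_2>, and the squared fluctuation of A_1 for parameters l1, l2."""
    weights = [math.exp(-l1 * a1 - l2 * a2) for a1, a2 in states]
    z = sum(weights)
    m1 = sum(w * a1 for w, (a1, _) in zip(weights, states)) / z
    m2 = sum(w * a2 for w, (_, a2) in zip(weights, states)) / z
    m11 = sum(w * a1 * a1 for w, (a1, _) in zip(weights, states)) / z
    return m1, m2, m11 - m1 * m1

l1, l2, h = 0.4, 0.9, 1e-5
d1_dl2 = (moments(l1, l2 + h)[0] - moments(l1, l2 - h)[0]) / (2.0 * h)   # d<A_1>/d(lambda_2)
d2_dl1 = (moments(l1 + h, l2)[1] - moments(l1 - h, l2)[1]) / (2.0 * h)   # d<A_2>/d(lambda_1)
d1_dl1 = (moments(l1 + h, l2)[0] - moments(l1 - h, l2)[0]) / (2.0 * h)   # d<A_1>/d(lambda_1)
variance_1 = moments(l1, l2)[2]

print(d1_dl2, d2_dl1)         # the two mixed derivatives agree
print(variance_1, -d1_dl1)    # squared fluctuation equals minus the diagonal derivative
```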


6.3.6 Canonical and Grand Canonical Ensembles 


For the canonical ensemble, the mean value of the energy $\langle E\rangle$ is given, in addition
to the norm $\langle 1\rangle$. According to the last section, we then have the canonical partition
function:

$$Z_C = \sum_z\exp(-\lambda_E E_z) = \operatorname{tr}[\exp(-\lambda_E E)]$$

and the probability distribution

$$\rho_C = \frac{1}{Z_C}\,\exp(-\lambda_E E) .$$

Note that the Lagrangian parameter $\lambda_E$ is related to the energy, but the letter $\beta$ is
usually used, even though $\beta$ will be used for the pressure coefficients (see p. 619).
Here, for brevity, we have left out the index $z$ for $\rho_C$ and $E$. For the same reason, the
trace notation is convenient for the partition function. If states are degenerate, we
have to multiply by their degree of degeneracy.

For canonical ensembles, what is important is thus to know how the given mean
value $\langle E\rangle$ depends upon the adjustable parameter $\lambda_E$. According to the last section,
we have (see Fig. 6.14)

$$\langle E\rangle = -\frac{\partial\ln Z_C}{\partial\lambda_E} .$$

We shall later relate the temperature to the parameter $\lambda_E$. Indeed, we shall find that
$\lambda_E$ is the reciprocal of $kT$.

Fig. 6.14 The level density increases approximately as $E^n$, and the occupation probability decreases
as $e^{-\lambda_E E}$. Hence $x^n e^{-x}$ is important, with maximum for $x = n$. Here this function is shown for
$n \in \{2, 4, 8, 16\}$ relative to its maximum, and therefore as a function of $x/n$

For any canonical ensemble of macroscopic bodies, only a small energy range
$\delta E$ is of importance. If we approximate its partition function $Z_C$ by an integral over
the energy with the integrand $f(E, N) = \exp(-\lambda_E E)\,\partial\Omega(E, N)/\partial E$, then for large
$N$, $f(E, N)$ has a very sharp maximum at $\hat E$. For the density of states of a gas of
interaction-free molecules, for example, the function $f(E) \propto \exp(-\lambda_E E)\,E^{3N/2}$
is to be considered near its maximum at $\hat E = \tfrac32 N/\lambda_E$, and after a Taylor series
expansion,

$$f(\hat E + \delta E) \approx f(\hat E)\,\exp\{-\tfrac34 N\,(\delta E/\hat E)^2\} ,$$

we find a Gauss distribution with the tiny width $\hat E/\sqrt{3N/2} \ll \hat E$. (Who would ever
determine the energy up to twelve digits for one mole?) Consequently, we have
$\hat E \approx \langle E\rangle$, and for such a sharp maximum, only the states from the nearest neighborhood
are important. Therefore, the canonical and micro-canonical ensembles
are very similar: the energy uncertainty (via $\lambda_E$) is given instead of the energy
range $dE$. Therefore, a distribution parameter $\lambda_E$ may even be assigned to a micro-canonical
ensemble, and with this a temperature, as will be shown in Sect. 6.3.8.
The requirement is that $\exp(-\lambda_E E)\,\partial\Omega/\partial E$ should have its maximum at $\hat E$, which
requires $\lambda_E\,\partial\Omega/\partial E = \partial^2\Omega/\partial E^2$, or $\lambda_E = \partial\ln(\partial\Omega/\partial E)/\partial E$ for $E = \hat E$, i.e.,
$(kT)^{-1} = \partial\ln(\partial\Omega/\partial E)/\partial E\,|_{\hat E}$.
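To get a feeling for the sharpness of this maximum, the following sketch evaluates the relative width $1/\sqrt{3N/2}$ for one mole and compares the exact logarithm of $f(\hat E + \delta E)/f(\hat E)$ for $f \propto \exp(-\lambda_E E)\,E^{3N/2}$ with its Gaussian approximation. The function names and the small test values of $N$ and $\delta E/\hat E$ are illustrative assumptions.

```python
import math

AVOGADRO = 6.02214076e23   # particles per mole

def relative_energy_width(n_particles):
    """Relative width 1/sqrt(3N/2) of the Gauss distribution around the maximum."""
    return 1.0 / math.sqrt(1.5 * n_particles)

def log_ratio_exact_and_gauss(n_particles, rel_shift):
    """ln[f(E_hat + dE)/f(E_hat)] for f ~ exp(-lam E) E^(3N/2), with lam = (3N/2)/E_hat.

    rel_shift is dE/E_hat. Returns the exact value and the Gaussian approximation.
    """
    exact = 1.5 * n_particles * (math.log1p(rel_shift) - rel_shift)
    gauss = -0.75 * n_particles * rel_shift ** 2
    return exact, gauss

width_one_mole = relative_energy_width(AVOGADRO)
exact, gauss = log_ratio_exact_and_gauss(1000, 0.01)
print(width_one_mole)      # of order 1e-12
print(exact, gauss)        # nearly equal for small relative shifts
```

The width of order $10^{-12}$ is what lies behind the remark about twelve digits for one mole.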

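For the grand canonical ensemble, in addition to $\langle 1\rangle$ and $\langle E\rangle$, we also fix the
particle number $\langle N\rangle$ only on the average. Then we have

$$\rho_{GC} = \frac{1}{Z_{GC}}\,\exp(-\lambda_E E - \lambda_N N) ,$$

with

$$Z_{GC} = \operatorname{tr}\exp(-\lambda_E E - \lambda_N N) = \sum_N\exp(-\lambda_N N)\,Z_C(N) .$$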




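Even more mean values characterize the generalized grand canonical ensemble.
For this, in addition to $\langle 1\rangle$, $\langle E\rangle$, and $\langle N\rangle$, further quantities $\langle V_i\rangle$ are given, e.g., the
average volume. Then we have

$$\rho = \frac1Z\,\exp\Big(-\lambda_E E - \lambda_N N - \sum_i\lambda_i V_i\Big) ,$$

with

$$Z = \operatorname{tr}\exp\Big(-\lambda_E E - \lambda_N N - \sum_i\lambda_i V_i\Big)$$

and

$$\langle E\rangle = -\frac{\partial\ln Z}{\partial\lambda_E} , \qquad (\Delta E)^2 = -\frac{\partial\langle E\rangle}{\partial\lambda_E} \ge 0 ,$$
$$\langle N\rangle = -\frac{\partial\ln Z}{\partial\lambda_N} , \qquad (\Delta N)^2 = -\frac{\partial\langle N\rangle}{\partial\lambda_N} \ge 0 ,$$
$$\langle V_i\rangle = -\frac{\partial\ln Z}{\partial\lambda_i} , \qquad (\Delta V_i)^2 = -\frac{\partial\langle V_i\rangle}{\partial\lambda_i} \ge 0 .$$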


In the following, we shall imagine as the other quantities $V_i$ only the volume and
then, instead of $\sum_i\lambda_i V_i$, take only $\lambda_V V$. Here we shall sometimes fix the particle
number, thus give only $\langle 1\rangle$, $\langle E\rangle$, and $\langle V\rangle$ as mean values. This ensemble has no
special name.

According to the last section, the entropy is

$$S = -k\sum_z\rho_z\ln\frac{\exp(-\sum_i\lambda_i A_{iz})}{Z} = k\,\Big(\ln Z + \sum_{i=1}^{n}\lambda_i\,\langle A_i\rangle\Big) .$$

Using this, for generalized grand canonical ensembles, we obtain

$$S = k\,(\ln Z + \lambda_E\langle E\rangle + \lambda_N\langle N\rangle + \lambda_V\langle V\rangle) ,$$

with somewhat simpler expressions for canonical and grand canonical ensembles,
which are not so important at the moment, because we also wish to investigate the
dependence on $\langle N\rangle$ and $\langle V\rangle$. Here $Z$ is a function of the Lagrangian parameters
$\lambda_E$, $\lambda_N$, and $\lambda_V$ (see, e.g., Fig. 6.15). We investigate the canonical partition function
$Z_C(\lambda_E, N, V)$ on p. 575 and the grand canonical partition function $Z_{GC}(\lambda_E, \lambda_N, V)$
on p. 579.

In the following, we shall usually drop the bracket symbols $\langle\ \rangle$, because we
consider only mean values anyway, if not explicitly stated otherwise. In addition,
we adopt the common practice in thermodynamics of writing $U$ for the energy $E$.
It is referred to as the internal energy, bearing in mind that there are also other
forms of energy. In Sect. 6.3.1, for the Maxwell distribution, we divided the kinetic
energy into the collective part $\tfrac{m}{2}\langle\mathbf v\rangle^2$ and the disordered part $\tfrac{m}{2}(\Delta v)^2$, since we have
$\langle v^2\rangle = \langle\mathbf v\rangle^2 + (\Delta v)^2$. For such an ideal gas, only the disordered motion counts for
the internal energy, the collective center-of-mass motion being considered as one of
the macroscopic parameters.

Fig. 6.15 If the volume $V$ or another extensive parameter changes, then every energy eigenvalue $E_z$
also changes, and so therefore does the density of states $\partial\Omega/\partial E$, here shown as a function of $E$ and
$V/V_0$ for the same example as in Fig. 6.13, viz., 100 molecules in a cube


6.3.7 Exchange Equilibria 


For inhibited (partial) equilibria with special “constraints” (inhibitions), only parts 
of the system are in equilibrium (each with an entropy as high as possible), which 
for the total system without the inhibition would have a higher entropy. It is not 
in total (global) equilibrium. We are not interested in the exact description of the 
transition from partial to total equilibrium under removal of the inhibition—for that, 
we would have to solve rate equations. Here the initial and final state suffice: the new 
equilibrium is reached by suitable alterations of the partial systems—an exchange 
equilibrium (total equilibrium) then develops. 

We exemplify by considering two separate closed systems, each of which is in
equilibrium and has the average energy $U_n$ and entropy $S_n$ ($n \in \{1, 2\}$). If the two
systems come into contact, in most cases, the total system will not yet be in equilibrium:
then the two parts exchange energy, as long as the total entropy increases, i.e.,
$S_{\mathrm f} > S_{\mathrm i}$. Here it is assumed that the coupling is so weak and the energy exchange
so slow that, for the total energy, $U = U_1 + U_2$ always holds and the probability
distribution always factorizes (and thus $S = S_1 + S_2$ holds). In the new equilibrium
state, the total entropy is then as high as possible: $\delta S = \delta S_1 + \delta S_2 = 0$ under the
constraint $\delta U = \delta U_1 + \delta U_2 = 0$. Exchange equilibrium with respect to the energy
leads to the requirement $\delta S = \sum_n(\partial S_n/\partial U_n)\,\delta U_n = 0$, thus to

$$\frac{\partial S_1}{\partial U_1} = \frac{\partial S_2}{\partial U_2} , \quad\text{or}\quad \lambda_{E1} = \lambda_{E2} ,$$

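since we have $S = k\,(\ln Z + \lambda_E U)$, where $Z$ is a function of $\lambda_E$ and hence of $U$,
thus $Z(\lambda_E(U))$, and this implies that

$$\frac1k\,\frac{\partial S}{\partial U} = \Big(\frac{\partial\ln Z}{\partial\lambda_E} + U\Big)\frac{\partial\lambda_E}{\partial U} + \lambda_E ,$$

and also, for $U = -\partial\ln Z/\partial\lambda_E$ and $\partial U/\partial\lambda_E = -(\Delta U)^2$,

$$\frac1k\,\frac{\partial S}{\partial U} = \frac{-U + U}{-(\Delta U)^2} + \lambda_E = \lambda_E .$$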


Note that, in the partial derivatives, $N$ and $V$, or $\lambda_N$ and $\lambda_V$, are held constant. The
equilibrium state of systems in thermal contact can thus be recognized by all parts
having equal distribution parameter $\lambda_{En}$.
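The equalization of $\lambda_E$ can be made concrete with a simple model. The sketch below assumes, purely for illustration, two ideal-gas-like parts with $S_n/k = \tfrac32 N_n\ln U_n$ (additive constants dropped), maximizes the total entropy at fixed total energy by a ternary search, and then compares the two values of $\lambda_E = \partial(S/k)/\partial U$. The function names and the numbers are hypothetical.

```python
import math

def entropy_over_k(u, n):
    """S/k of one part up to a constant: (3/2) N ln U (ideal-gas assumption)."""
    return 1.5 * n * math.log(u)

def lambda_e(u, n):
    """Distribution parameter lambda_E = d(S/k)/dU of one part."""
    return 1.5 * n / u

def equilibrium_split(u_total, n1, n2, iterations=200):
    """Energy U_1 maximizing S_1 + S_2 at fixed U_1 + U_2 (ternary search, concave function)."""
    def total(u1):
        return entropy_over_k(u1, n1) + entropy_over_k(u_total - u1, n2)
    lo, hi = 1e-9 * u_total, (1.0 - 1e-9) * u_total
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if total(a) < total(b):
            lo = a       # the maximum lies to the right of a
        else:
            hi = b       # the maximum lies to the left of b
    return 0.5 * (lo + hi)

u_total, n1, n2 = 10.0, 100, 300
u1 = equilibrium_split(u_total, n1, n2)
lam1 = lambda_e(u1, n1)
lam2 = lambda_e(u_total - u1, n2)
print(u1, lam1, lam2)
```

In this model the energy is shared in proportion to the particle numbers, and the two distribution parameters coincide within the accuracy of the search.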

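These considerations are clearly valid not only for the energy $U$, but also for the
particle number and the volume. Under the constraint $\delta N = 0$, $\delta S = 0$ delivers

$$\frac{\partial S_1}{\partial N_1} = \frac{\partial S_2}{\partial N_2} , \quad\text{or}\quad \lambda_{N1} = \lambda_{N2} ,$$

and under the constraint $\delta V = 0$, $\delta S = 0$ delivers

$$\frac{\partial S_1}{\partial V_1} = \frac{\partial S_2}{\partial V_2} , \quad\text{or}\quad \lambda_{V1} = \lambda_{V2} .$$

The exchange equilibrium is only reached if the Lagrangian parameters in all parts
agree with each other.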

Now we can better understand how reversible and irreversible changes of state
are distinguished. In the last section, we removed inhibitions from closed systems, and
local differences were then equalized, e.g., by diffusion or temperature adjustment.
Such a change of state proceeds by itself and is not reversible, but irreversible, and
the entropy increases.

However, we may also modify external parameters, e.g., supply energy. This 
may also happen reversibly, or one part reversibly and another part irreversibly. The 
change is then reversible if it proceeds solely through equilibrium states. However, 
this constraint is only satisfied if no internal equalization is necessary. 


6.3.8 Temperature, Pressure, and Chemical Potential 


According to p. 513, the zeroth main theorem of thermodynamics states that: There
is a state variable called temperature $T$ and two parts of a system are only in thermal
equilibrium if they have the same temperature. This equilibrium depends in particular
on the possibility that energy may be exchanged. Like $\lambda_E$, the temperature is the
same in all parts, so the two parameters describe the same situation. The larger $\lambda_E$,
the more important are the states of low energy, and the cooler the considered body:
$\lambda_E$ is inversely proportional to the temperature. They are related by the Boltzmann
constant $k$ according to

$$\lambda_E = \frac{1}{kT} ,$$

as we shall show. This also implies

$$\frac{\partial}{\partial\lambda_E} = -kT^2\,\frac{\partial}{\partial T} .$$

If the average energy is given, then the (thermodynamic) temperature $T$ characterizes
the equilibrium distribution, but if the energy has to be sharp, then the notion of
temperature is useless.

However, the zeroth main theorem only states when two temperatures are equal.
We could also take another function $f(T)$ as the temperature. In this sense any
uncalibrated mercury thermometer serves its purpose within its measurable range,
but without a gauge, not even temperature differences can be given uniquely. Thus
for a canonical distribution, the thermodynamic temperature is uniquely determined
by $T = (k\lambda_E)^{-1}$. Then the behavior of macroscopic models, e.g., of an ideal gas, can
be determined as a function of the temperature (or of the parameter $\lambda_E$), and hence
a gas thermometer can be constructed as a measuring device. In Sect. 6.5.4, we shall
prove the thermal equation of state for ideal gases (the Gay-Lussac law), viz.,

$$pV = NkT ,$$

from which the gas thermometer gauge may be derived. And we shall actually prove
$pV = N/\lambda_E$ there!

It is immediately clear that, for $T = 0$, special situations occur, since then $\lambda_E = \infty$.
Now for all equilibria with a finite energy uncertainty (with $T > 0$),

$$(\Delta U)^2 = -\frac{\partial U}{\partial\lambda_E} = kT^2\,\frac{\partial U}{\partial T} > 0 .$$

With decreasing temperature $T$, the internal energy $U$ thus also decreases, implying
that the states of low energy are preferentially occupied. In this limit, only the ground
state is occupied, if it is not degenerate. Correspondingly, the equilibrium distribution
for $T = 0$ only depends on whether or not the ground state is degenerate, and likewise
the entropy. If there is no degeneracy, then $\rho_z$ is different from 0 for only one $z$ and
hence $S = 0$. This property is called the third main theorem of thermodynamics.

In classical statistical mechanics, the following equidistribution law can be
derived: All canonical variables (positions, momenta) which occur in only one term
in the Hamilton function, and there as squared, contribute the value $\tfrac12 kT$ to the internal
energy in a canonical ensemble. For the proof, we take the Hamilton function
$H = H_0 + cx^2$, where $H_0$ and $c$ do not depend upon the coordinate $x$. In a canonical
ensemble, this variable $x$ contributes


$$\frac{\int dx\; cx^2 \exp(-\lambda_E cx^2)}{\int dx\; \exp(-\lambda_E cx^2)} = -\frac{\partial}{\partial\lambda_E} \ln \int_{-\infty}^{\infty} dx\; \exp(-\lambda_E cx^2)$$

to the internal energy. The integral has the value $\sqrt{\pi/(\lambda_E c)}$, whence $\frac{1}{2}\ln\lambda_E$ has to
be differentiated with respect to $\lambda_E$, which results in $\frac{1}{2}\lambda_E^{-1} = \frac{1}{2}kT$. This proves the
equidistribution law.
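
The equidistribution result can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the ratio of the two integrals with a simple trapezoidal rule and compares it with $1/(2\lambda_E)$. The function name `mean_quadratic_energy`, the integration window, and the sample values of c and $\lambda_E$ are illustrative assumptions.

```python
import math

def mean_quadratic_energy(c, lam, n=20001):
    """Return <c x^2> for the weight exp(-lam * c * x^2), via the trapezoidal rule.

    lam stands for lambda_E = 1/(kT); c is the coefficient of x^2 in H.
    """
    # Integrate over +-12 standard deviations, with sigma^2 = 1 / (2 * lam * c).
    half_width = 12.0 / math.sqrt(2.0 * lam * c)
    h = 2.0 * half_width / (n - 1)
    num = 0.0
    den = 0.0
    for i in range(n):
        x = -half_width + i * h
        w = math.exp(-lam * c * x * x)
        weight = 0.5 if i in (0, n - 1) else 1.0   # trapezoidal end-point weights
        num += weight * c * x * x * w
        den += weight * w
    return num / den   # the common factor h cancels in the ratio

lam = 2.5   # illustrative value of lambda_E in arbitrary units
for c in (0.3, 1.0, 7.0):
    # The result should not depend on c and should equal 1/(2*lam) = kT/2.
    print(c, mean_quadratic_energy(c, lam), 0.5 / lam)
```

The independence of the result from c is the point of the law: every quadratic term contributes the same amount, whatever its prefactor.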

Then, for example, for force-free motion, the squares of the components of the
momentum for the three space directions enter as separate terms. A single free
particle thus has the energy $\frac{1}{2}m(\Delta v)^2 = \frac{3}{2}kT$, as claimed on p. 547 for the Maxwell
distribution, and now proven. Consequently, ideal gases with N atoms (without
internal degrees of freedom) have $U = \frac{3}{2}NkT$. Correspondingly, for the linear harmonic
oscillator, the internal energy is $\frac{2}{2}kT = kT$, since both the momentum and the position
enter as squares. The virial theorem in mechanics (see p. 79) then shows that
$\langle E_{\rm pot}\rangle = \langle E_{\rm kin}\rangle$. It thus also holds in quantum theory. Note, however, that quantum
theory often delivers discrete energy eigenvalues, so the above-mentioned integrals are
then sums $\sum_z \rho(E_z)\,E_z$, which for low temperatures leads to deviations from classical
statistics. This shows up quite clearly in connection with the freezing of degrees of
freedom.

If two parts exchange not only energy, but also volume, then not only do their 
temperatures become equal, but also their values of the parameter $\lambda_V$. It is common
to set


$$\lambda_V = p\,\lambda_E = \frac{p}{kT},$$


because pV is then an energy. This means that p is an energy/volume = force/area 
and has the unit N/m$^2$ = Pa = $10^{-5}$ bar. In addition, for fixed $\lambda_E$ (> 0),
$(\Delta V)^2 = -\partial V/\partial\lambda_V > 0$ implies the relations $\partial V/\partial p < 0$ and $\partial p/\partial V < 0$. If the volume
decreases, then p increases, provided that no other parameters change: p is the 
pressure with which the system acts on the container walls. It is only when it is the 
same in all parts that any volume exchange will cease. 

Correspondingly, the Lagrangian parameter $\lambda_N$ becomes the same in all parts of
a system if particles can be exchanged. We assume that the temperature becomes
equal and set

$$\lambda_N = -\mu\,\lambda_E = -\frac{\mu}{kT}.$$

Then $\mu N$ is an energy, and so is $\mu$, the chemical potential. Like temperature and
pressure, it is a distribution parameter and important for chemical reactions, as will
be shown below. Since $(\Delta N)^2 > 0$, we have $\partial N/\partial\mu > 0$ for fixed $\lambda_E$ (> 0). As
observed, e.g., in Figs. 6.19 and 6.22, the chemical potential is often, but not always,
negative.

For materials involving different types of particles, the expression $\mu N$ in the
exchange equilibrium is replaced by $\sum_i \mu_i N_i$, as will be proven in Sect. 6.5.5.
However, chemical equilibria have to be treated separately, because the molecules
are counted as particles, but in chemical reactions, only the number of atoms is
constant, and not necessarily the number of molecules, e.g., not for
2 H$_2$O $\to$ 2 H$_2$ + O$_2$. If we take $X_i$ as a symbol for the ith sort of molecule, then we have


$$\sum_i \nu_i X_i = 0,$$


where the stoichiometric coefficients $\nu_i$ are positive for reaction products, negative
for reactants (and then integers as small as possible). In the above-mentioned
example, they take the values −2, 2, and 1. After dn reactions, we have $dN_i = \nu_i\,dn$
(actually n is a natural number, but we may go over to a continuum by referring to the
very large total number). This implies $\delta S = \sum_i (\partial S/\partial N_i)\,\nu_i\,dn = 0$ as equilibrium
condition. Then, according to the last section, $\sum_i \lambda_{N_i}\nu_i = 0$, and hence,


$$\sum_i \nu_i\,\mu_i = 0.$$


We shall use this equation on p. 588 for the law of mass action for chemical reactions. 


6.3.9 Summary: Equilibrium Distributions 


Equilibrium distributions do not change with time—the entropy is as high as possible 
given the constraints. This happens if the probability distribution depends only on 
the energy. For the micro-canonical ensemble, all states in the energy range from E 
to E+dE are occupied with equal probability. For the other canonical ensembles, 
some parameters are given only as average values (for macroscopic systems, the 
fluctuations about the mean value are normally extremely small). To each mean 
value there is a distribution parameter which, in the exchange equilibrium, is the 
same for all parts. To the energy corresponds the temperature T, to the volume the 
pressure p, and to the particle number the chemical potential $\mu$. Here the Lagrangian
parameters $\lambda_E = 1/kT$, $\lambda_V = p/kT$, and $\lambda_N = -\mu/kT$ were initially introduced
as distribution parameters. For n given mean values $\{\langle A_i\rangle\}$, the partition function
$Z = {\rm tr}[\exp(-\sum_{i=1}^{n} \lambda_i A_i)]$ turns out to be useful because $\langle A_i\rangle = -\partial\ln Z/\partial\lambda_i$ and
$(\Delta A_i)^2 = \partial^2\ln Z/\partial\lambda_i^2$.
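
The two derivative rules for the partition function can be illustrated for the simplest case of a single constraint, the energy. The sketch below is only an illustration under stated assumptions: the level scheme `energies` is hypothetical, `lam` stands for $\lambda_E$, and the derivatives of ln Z are taken by central finite differences rather than analytically.

```python
import math

def ln_Z(lam, energies):
    """ln of the canonical partition function Z = sum_z exp(-lam * E_z)."""
    return math.log(sum(math.exp(-lam * e) for e in energies))

def mean_and_variance(lam, energies):
    """<E> and (Delta E)^2 computed directly from the canonical probabilities."""
    weights = [math.exp(-lam * e) for e in energies]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean = sum(p * e for p, e in zip(probs, energies))
    var = sum(p * (e - mean) ** 2 for p, e in zip(probs, energies))
    return mean, var

energies = [0.0, 1.0, 1.0, 2.5]   # hypothetical level scheme, arbitrary units
lam = 0.8                          # lambda_E = 1/(kT) in the same units
h = 1e-4                           # step for central finite differences

# First and second derivative of ln Z with respect to lambda_E.
d1 = (ln_Z(lam + h, energies) - ln_Z(lam - h, energies)) / (2 * h)
d2 = (ln_Z(lam + h, energies) - 2 * ln_Z(lam, energies) + ln_Z(lam - h, energies)) / h**2

mean, var = mean_and_variance(lam, energies)
print(mean, -d1)   # <E> versus -d ln Z / d lambda
print(var, d2)     # (Delta E)^2 versus d^2 ln Z / d lambda^2
```

The same pattern carries over to several constraints, with one Lagrangian parameter per mean value.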


6.4 General Theorems of Thermodynamics 


6.4.1 The Basic Relation of Thermodynamics 


From the relation for the entropy of a generalized grand canonical ensemble, we shall 
now derive the following important equation of macroscopic thermodynamics: 


$$dU = T\,dS - p\,dV + \mu\,dN.$$


Since we take the equilibrium expression for S, it holds only for reversible changes 
of state, or at least for changes of state in which so far all external parameters have 
been kept fixed, so dV = 0 and dN = 0. 

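In Sect. 6.3.6, we derived the equation

$$S = k\,(\ln Z + \lambda_E U + \lambda_V V + \lambda_N N)$$

for the entropy. Here the partition function Z is a function of the three Lagrangian
parameters $\lambda_E$, $\lambda_V$, and $\lambda_N$, and according to the same discussion, $\langle A_i\rangle = -\partial\ln Z/\partial\lambda_i$
implies $d\ln Z = -U\,d\lambda_E - V\,d\lambda_V - N\,d\lambda_N$, and hence

$$dS = k\,(\lambda_E\,dU + \lambda_V\,dV + \lambda_N\,dN).$$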


According to Sect. 6.3.8, the Lagrangian parameters $\lambda_E$, $\lambda_V$, and $\lambda_N$ are related to
the temperature T, the pressure p, and the chemical potential $\mu$:


$$\lambda_E = \frac{1}{kT}, \quad \lambda_V = p\,\lambda_E, \quad\text{and}\quad \lambda_N = -\mu\,\lambda_E.$$


Consequently, for $T \ne 0$,


$$dS = \frac{dU + p\,dV - \mu\,dN}{T},$$


and we have thus proven the claim that $dU = T\,dS - p\,dV + \mu\,dN$.

For the grand canonical ensemble, the term $-p\,dV$ does not occur, because the
volume is to be kept constant, and for the canonical ensemble, the term $\mu\,dN$ is
also missing, because the particle number is then also fixed. Particularly often, the
equation is used with dN = 0, namely, in the form $dU = T\,dS - p\,dV$.

If the changes in the state quantities do not proceed purely through equilib- 
rium states, but nevertheless begin and end with such states, then, in addition to 
the reversible change of state just treated, there will also be an irreversible one. 
According to the entropy law—and from now on we always assume dt > 0—the 
entropy increases without a change in the other macroscopic parameters. This can 
be accounted for by 


$$dS \ge \frac{dU + p\,dV - \mu\,dN}{T},$$


or again, for T > 0,

$$dU \le T\,dS - p\,dV + \mu\,dN.$$


The equations for reversible processes become inequalities for irreversible ones, if 
we stay with fixed dt > 0. 


6.4.2 Mechanical Work and Heat 


For fixed particle number (dN = 0), we now consider the inequality

$$dU \le T\,dS - p\,dV$$

somewhat more deeply, thus allowing for irreversible changes of state. We think,
for example, of a gas with pressure p in a cylinder with a (friction-free) movable
piston. In order to reduce the volume (dV < 0), we have to do work $\delta A = -p\,dV$
on the system. This energy is stored in the gas: its pressure increases, because
the molecules hit the walls more often. Alternatively, a spring might be extended or
compressed. Instead of $\delta A = -p\,dV$, we may also take $\delta A = (\pm)\sum_k F_k\,dx^k$ with
generalized coordinates $x^k$ and associated generalized forces $F_k$. The sign has to be
adjusted to the relevant convention.

The work $\delta A$ is not generally a complete differential, because heat is also
transferred. Even in a cycle process, i.e., going through different states before returning
to the initial state, $\oint\delta A$ does not generally vanish. If it did, this would be a sign of
a complete differential dA, or a state variable A, whence the integral $\int dA$ would
depend only on the initial and final points of the path and not on the path in-between.

We know this situation already from mechanics (p. 56). Only for $\oint \mathbf{F}\cdot d\mathbf{r} = 0$ can
we introduce a potential energy. Lorentz and frictional forces are situations where
this is not possible. At least the Lorentz force (see Sect. 2.3.4) can be derived from a
generalized potential energy $q\,(\Phi - \mathbf{v}\cdot\mathbf{A})$, or $q\,v_\mu A^\mu$, if, in addition to the position,
we also allow the velocity as a variable, provided that there is no frictional force. As
is well known, friction leads to heat, our subject here.

The internal energy U also increases if we supply energy without changing the 
volume V. Here the temperature does not even need to increase notably (latent heat). 
Then, e.g., at the normal freezing temperature of water, we need a melting heat of 6
kJ/mole to melt ice. This is often written in the form (H$_2$O) = [H$_2$O] + 6 kJ. If the
solid phase is set in square brackets, the liquid in round, and the gaseous in curly,
then we have (per mole)


(...) = [...] + melting heat, 
{...} = (...) + vaporization heat , 
{...} = [...] + sublimation heat . 


Here, we may neglect the volume change for melting, but not of course for vaporiza- 
tion, which is why there are tables, e.g., [8], listing the vaporization enthalpy, i.e., 
the energy difference for constant pressure. We shall return to this in Sect. 6.4.4. 

If we set $\delta Q$ for the amount of heat in an infinitesimal process, the energy
conservation law for dN = 0 takes the form


$$dU = \delta Q + \delta A, \quad\text{with } dU = 0 \text{ for closed systems}.$$


This important equation is called the first main theorem of thermodynamics. Here, 
irreversible processes are also permitted. The essentially new aspect compared to 
mechanics is the kind of energy, i.e., “heat”. 

If we restrict ourselves to reversible processes, the comparison with the first
mentioned equation $dU = T\,dS + \delta A$ supplies the second main theorem of
thermodynamics, viz.,

$$\delta Q_{\rm rev} = T\,dS, \quad\text{or}\quad dS = \frac{\delta Q_{\rm rev}}{T}.$$


After our rather detailed investigation of the entropy, this is almost self-evident, as 
soon as the notion of the amount of heat has been clarified by the first main theorem. 

While the entropy for reversible $\delta Q_{\rm rev}$ may increase or decrease, depending on
its sign, for irreversible processes it always increases. We have already investigated
in detail the entropy law “$dS/dt \ge 0$ for closed systems” as a further constituent
of the second main theorem. Therefore, all the main theorems of thermodynamics
have been explained sufficiently. We have already discussed the zeroth and third in
Sect. 6.3.8.

Note that, using the second main theorem, a thermometer can be gauged, which
was still a problem according to p. 559. In particular, by the second main theorem, the
equation

$$\oint dS = \oint \frac{\delta Q_{\rm rev}}{T} = 0$$

holds for a cycle.


The Carnot process appears in the (S, T) diagram in Fig. 6.16 as a rectangle with


$$0 = \oint dS = \frac{Q_+}{T_+} - \frac{Q_-}{T_-} \quad\Longrightarrow\quad \frac{Q_+}{Q_-} = \frac{T_+}{T_-}.$$


Hence, via the reversibly exchanged amounts of heat, the temperature can be mea- 
sured in arbitrary units—the discussion in Sect. 6.3.8 did not reach this far. 


Fig. 6.16 In the Carnot cycle, the amount of heat $Q_+$ is reversibly taken in at
the temperature $T_+$ and the amount of heat $Q_-$ is reversibly taken out at the
temperature $T_-$. No heat is exchanged in-between, and the total work taken in is
$Q_+ - Q_-$, equal to the enclosed area in the (S, T) diagram. For a more general
cycle, see Problem 6.25


The Carnot cycle is the ideal of a steam engine. In the combustion chamber, an
amount of heat $Q_+$ is taken in at the temperature $T_+$, and in the condenser, an amount
of heat $Q_-$ is taken out at the temperature $T_-$ and given off to the cooling water
(usually also at intermediate temperatures, which is not convenient). The difference
$Q_+ - Q_- = \oint\delta Q$ can at most be converted to exploitable work $-\oint\delta A$, the energy
remaining conserved for cyclic systems on the time average, and always for closed
systems. The ratio of this work to the gained (input) energy $Q_+$ is the thermodynamic
efficiency $\eta$ of the machine. (Modern power plants can reach $\eta > 45\%$, James Watt
had $\eta \approx 3\%$, and his predecessors, e.g., Thomas Savery, a tenth of it.) According
to Carnot, this efficiency has an upper limit $\eta_C < 1$, because $\eta_C = (Q_+ - Q_-)/Q_+ =
1 - T_-/T_+$ and the cooling water (without energy input) cannot be cooler than the
environment (and the fire cannot be arbitrarily hot). In reality, the efficiency is less,
because heat is exchanged for intermediate temperatures and everything should go
quickly, so changes are not only quasi-stationary.
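
As a small worked example of the Carnot limit $1 - T_-/T_+$, the following sketch evaluates it for one pair of temperatures. The function name `carnot_efficiency` and the temperatures are illustrative assumptions, not values from the text.

```python
def carnot_efficiency(t_hot, t_cold):
    """Upper limit eta_C = 1 - T_-/T_+ for absolute temperatures in kelvin."""
    if not (t_hot > t_cold > 0):
        raise ValueError("need T_+ > T_- > 0 (absolute temperatures)")
    return 1.0 - t_cold / t_hot

# Illustrative temperatures only (not taken from the text):
eta_c = carnot_efficiency(t_hot=800.0, t_cold=300.0)
print(eta_c)                      # 0.625

# For a heat input Q_+ the reversible cycle must reject Q_- = Q_+ * T_-/T_+.
q_plus = 1000.0                   # J, illustrative
q_minus = q_plus * 300.0 / 800.0  # J
print(q_plus - q_minus)           # maximum work per cycle, in J
```

Any real engine working between the same two temperatures delivers less work than this.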

In essence, the steam engine converts a part of the disordered motion (at high 
temperature) into ordered motion (work)—the energy is thereby changed from many 
degrees of freedom to a few. Nevertheless, the total entropy does not decrease, 
because it moves heat from the fire into the cooling water, and there the entropy 
increases more notably. 


6.4.3 State Variables and Complete Differentials 


State variables characterize a state, e.g., energy U, particle number N, and volume
V are state variables in thermodynamics. They may be taken as functions of other
state variables $(x^1, \ldots) = x$. Then,


$$df = f(x + dx) - f(x) = \sum_i \frac{\partial f}{\partial x^i}\,dx^i \quad\text{and}\quad \oint df = 0.$$


This quantity df is called a complete (or total or exact) differential.

But not every infinitesimal quantity $\delta f$ is a complete differential df. We shall
write $\delta f$ for all differential forms of the kind encountered in the variational calculus,
while many use only df, even for non-exact differentials. Then,


$$\delta f = \sum_i a_i\,dx^i$$


is a complete differential only if $a_i = \partial f/\partial x^i$ for all i, and on all simply-connected
regions,


$$\frac{\partial a_i}{\partial x^k} = \frac{\partial a_k}{\partial x^i}, \quad\text{for all } i \text{ and } k.$$


Thus $\partial^2 f/\partial x^k\partial x^i = \partial^2 f/\partial x^i\partial x^k$ is required, but the partial derivatives only
commute if they are continuous. If this necessary and sufficient constraint on a complete
differential is violated, then the infinitesimal quantity $\delta f$ is a “non-exact
differential”. Then the path becomes decisively important for the integration. For example,
$\delta f = \alpha x^{-1}\,dx + \beta x\,dy$ is not exact, since $\partial a_x/\partial y = 0$, but $\partial a_y/\partial x = \beta$. If we
integrate here from (1, 1) to (2, 2), going parallel to each axis in turn, then the path via
(2, 1) yields $\int\delta f = \alpha\ln 2 + 2\beta$, while the path via (1, 2) yields $\int\delta f = \beta + \alpha\ln 2$,
whence $\oint\delta f \ne 0$.
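
The path dependence in this example can be reproduced numerically. The sketch below is an illustration only: it integrates the form with coefficients $\alpha/x$ (for dx) and $\beta x$ (for dy) along the two axis-parallel paths with a composite Simpson rule. The helper `integrate` and the values chosen for `alpha` and `beta` are assumptions made for the illustration.

```python
import math

def integrate(func, a, b, n=2000):
    """Composite Simpson rule for the integral of func from a to b (n even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

alpha, beta = 1.5, 0.7   # arbitrary illustrative constants

def a_x(x, y):
    return alpha / x     # coefficient of dx

def a_y(x, y):
    return beta * x      # coefficient of dy

# Path via (2, 1): first along x at y = 1, then along y at x = 2.
via_21 = (integrate(lambda x: a_x(x, 1.0), 1.0, 2.0)
          + integrate(lambda y: a_y(2.0, y), 1.0, 2.0))
# Path via (1, 2): first along y at x = 1, then along x at y = 2.
via_12 = (integrate(lambda y: a_y(1.0, y), 1.0, 2.0)
          + integrate(lambda x: a_x(x, 2.0), 1.0, 2.0))

print(via_21, alpha * math.log(2) + 2 * beta)
print(via_12, alpha * math.log(2) + beta)
print(via_21 - via_12, beta)   # the closed-loop integral equals beta, not zero
```

The difference between the two paths is just $\beta$, so the closed-loop integral vanishes only for $\beta = 0$, where the form reduces to the exact differential of $\alpha\ln x$.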

In three dimensions, this necessary and sufficient constraint for a complete dif- 
ferential can also be expressed by 


$$\nabla\times\mathbf{a} = 0.$$


In mechanics, therefore, a potential can only be introduced for curl-free forces (see 
p. 56). 

Note that, always in two dimensions, and in special cases in higher dimensions, 
an incomplete differential can be made into a complete differential by multiplying 
by a suitable function (the integrating factor, also called Euler’s integrating factor), 
which then becomes a state variable. The integrating factor for $\delta Q_{\rm rev}$ is $T^{-1}$.

Changes of state are named after the conserved variable: 


dS = 0  isentropic,    dV = 0  isochoric,
dT = 0  isothermal,    dp = 0  isobaric.


For reversible processes, isentropic means the same as adiabatic, i.e., without heat
exchange. With the ideal Carnot process, the states change either isentropically or
isothermally, so in the (S, T) diagram, it is easier to represent than in the (V, p)
diagram.


6.4.4 Thermodynamical Potentials and Legendre 
Transformations 


For the internal energy U, on p. 561, we derived the differential form

$$dU = T\,dS - p\,dV + \mu\,dN$$

for reversible processes. Consequently, the state variables S, V, and N, the so-called
natural variables, are particularly well suited as independent variables for the internal
energy. We can in particular obtain the associated intensive quantities T, p, and $\mu$
from the internal energy U by differentiation:


$$\Big(\frac{\partial U}{\partial S}\Big)_{V,N} = T, \quad \Big(\frac{\partial U}{\partial V}\Big)_{S,N} = -p, \quad \Big(\frac{\partial U}{\partial N}\Big)_{S,V} = \mu.$$


Likewise, the potential energy $E_{\rm pot}$ may be differentiated with respect to the
generalized coordinates $x^k$, which delivers generalized forces $\partial E_{\rm pot}/\partial x^k = -F_k$. Therefore,
the internal energy U is one of the thermodynamic potentials.

As already mentioned for the vaporization heat on p. 563, it is often appropriate 
to replace the extensive variables S, V, or N by their associated intensive parameters 
T, p, or $\mu$, respectively, if, e.g., the temperature and pressure are kept fixed, but not
the entropy and volume. 

We have already encountered such transformations of variables in mechanics,
where we replaced the Lagrange function $L(t, x, \dot x)$ by the Hamilton function
$H(t, x, p)$ with $p = \partial L/\partial\dot x$. This is made possible using a Legendre transformation:


$$\frac{\partial A}{\partial B} = C, \quad\text{or}\quad dA = C\,dB, \qquad\Longrightarrow\qquad d(BC - A) = B\,dC, \quad\text{or}\quad \frac{\partial(BC - A)}{\partial C} = B.$$


If we thus want to replace the variable B by $C = \partial A/\partial B$, then we take $BC - A$
instead of A. So, when $H = \dot x p - L$ was chosen, we obtained $\partial H/\partial p = \dot x$.
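
The defining property of the Legendre transformation, namely that the derivative of $BC - A$ with respect to C returns B, can be verified numerically for a concrete function. The sketch below is an illustration under assumptions: the choice $A(B) = \cosh B$ (so that $C = \sinh B$ can be inverted in closed form) and the function names are not from the text.

```python
import math

def A(B):
    return math.cosh(B)          # example function with dA/dB = sinh(B)

def B_of_C(C):
    return math.asinh(C)         # inverse of C(B) = dA/dB = sinh(B)

def legendre(C):
    """BC - A expressed as a function of the new variable C."""
    B = B_of_C(C)
    return B * C - A(B)

C0 = 1.3                         # arbitrary evaluation point
h = 1e-5                         # step for the central difference
dL_dC = (legendre(C0 + h) - legendre(C0 - h)) / (2 * h)

print(dL_dC, B_of_C(C0))         # both numbers agree: d(BC - A)/dC = B
```

The same check works for any function A(B) whose derivative can be inverted, which is exactly the condition under which the old variable B can be replaced by C.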

We now introduce the following thermodynamic potentials: 


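$$\begin{aligned}
&U && \text{internal energy},\\
&H = U + pV && \text{enthalpy},\\
&F = U - TS && \text{(Helmholtz) free energy},\\
&G = H - TS = F + pV && \text{free enthalpy (Gibbs free energy)},
\end{aligned}$$


to obtain new natural variables with their differentials:


$$\begin{aligned}
dU &= +T\,dS - p\,dV + \mu\,dN,\\
dH &= +T\,dS + V\,dp + \mu\,dN,\\
dF &= -S\,dT - p\,dV + \mu\,dN,\\
dG &= -S\,dT + V\,dp + \mu\,dN.
\end{aligned}$$


Clearly, we could also introduce four further grand canonical potentials $U - \mu N$,
$H - \mu N$, $F - \mu N = J$, and $G - \mu N$. Of these, we shall also need


$$dJ = -S\,dT - p\,dV - N\,d\mu,$$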


from Sect. 6.5.2 onward. However, we often consider systems with a given particle 
number. Then we have dN = 0, the four equations are simplified (the chemical 
potential no longer plays a role), and the grand canonical potential becomes obsolete. 
If, on the other hand, further variables are important, then additional terms appear, 
e.g., with electric or magnetic fields. 

The expression thermodynamic potential is, however, only justified if it is taken as 
a function of its natural variables, thus, e.g., U(S, V, N). Otherwise, simple partial 
derivatives do not result. Then according to p. 43 and this section, 


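$$\Big(\frac{\partial U}{\partial V}\Big)_T = \Big(\frac{\partial U}{\partial V}\Big)_S + \Big(\frac{\partial U}{\partial S}\Big)_V \Big(\frac{\partial S}{\partial V}\Big)_T = -p + T\,\Big(\frac{\partial S}{\partial V}\Big)_T,$$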


and the last mentioned derivative has still to be determined. We shall return to this 
in the next section. 

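From the Legendre transformation equations above with $(\partial C/\partial B)\,(\partial B/\partial C) = 1$
for $C = \partial A/\partial B$, it is clear that $\partial^2 A/\partial B^2 \cdot \partial^2(BC - A)/\partial C^2 = 1$. Taking the first
equation, e.g., with A = U, B = V, and C = −p for fixed S, this delivers


$$\begin{aligned}
-1 &= \Big(\frac{\partial^2 H}{\partial p^2}\Big)_S \Big(\frac{\partial^2 U}{\partial V^2}\Big)_S = \Big(\frac{\partial^2 F}{\partial T^2}\Big)_V \Big(\frac{\partial^2 U}{\partial S^2}\Big)_V\\
&= \Big(\frac{\partial^2 G}{\partial T^2}\Big)_p \Big(\frac{\partial^2 H}{\partial S^2}\Big)_p = \Big(\frac{\partial^2 G}{\partial p^2}\Big)_T \Big(\frac{\partial^2 F}{\partial V^2}\Big)_T,
\end{aligned}$$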


each for fixed particle number N. Here we have written first the negative and then 
the positive factor, and we shall encounter such sign rules in the next section. 


6.4.5 Maxwell’s Integrability Conditions and Thermal 
Coefficients 


The thermodynamic potentials are state variables, and therefore integrability condi- 
tions are valid: their mixed derivatives do not depend upon the sequence of differ- 
entiations (except for phase transitions). We shall use this now and always keep the 
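particle number fixed. Then, for f(x, y), instead of $\partial^2 f/\partial x\,\partial y = \partial^2 f/\partial y\,\partial x$, we
write more precisely

$$\Big(\frac{\partial}{\partial x}\Big(\frac{\partial f}{\partial y}\Big)_x\Big)_y = \Big(\frac{\partial}{\partial y}\Big(\frac{\partial f}{\partial x}\Big)_y\Big)_x.$$


These imply four integrability conditions, depending on which pair of S, T, V, and
p is taken as the natural variables:


$$\begin{aligned}
dU &= +T\,dS - p\,dV &&\Longrightarrow& \Big(\frac{\partial T}{\partial V}\Big)_S &= -\Big(\frac{\partial p}{\partial S}\Big)_V,\\
dH &= +T\,dS + V\,dp &&\Longrightarrow& \Big(\frac{\partial T}{\partial p}\Big)_S &= +\Big(\frac{\partial V}{\partial S}\Big)_p,\\
dF &= -S\,dT - p\,dV &&\Longrightarrow& \Big(\frac{\partial S}{\partial V}\Big)_T &= +\Big(\frac{\partial p}{\partial T}\Big)_V,\\
dG &= -S\,dT + V\,dp &&\Longrightarrow& \Big(\frac{\partial S}{\partial p}\Big)_T &= -\Big(\frac{\partial V}{\partial T}\Big)_p.
\end{aligned}$$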


Here derivatives of p and V with respect to S and T are related to the “inverse
derivatives” of S and T with respect to p and V. The partner is always kept fixed:
p and V form one pair, S and T the other. For the derivative $\partial p/\partial S = (\partial S/\partial p)^{-1}$,
there occurs a minus sign. For all four derivative pairs, we shall now introduce
abbreviations.

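The derivative $(\partial p/\partial T)_V$ is the pressure coefficient. It is denoted by $\beta$, but note
that $\beta$ is often used for $(kT)^{-1}$. It is related to p by the thermal stress coefficient
$\beta/p$, and related to the volume derivative $(\partial V/\partial T)_p$ by the thermal expansion
coefficient $\alpha$:


$$\begin{aligned}
\alpha &= \frac{1}{V}\Big(\frac{\partial V}{\partial T}\Big)_p = -\frac{1}{V}\Big(\frac{\partial S}{\partial p}\Big)_T && \text{expansion coefficient},\\
\beta &= \Big(\frac{\partial p}{\partial T}\Big)_V = \Big(\frac{\partial S}{\partial V}\Big)_T && \text{pressure coefficient}.
\end{aligned}$$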


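The derivative $(\partial T/\partial V)_S$ in the first pair $-(\partial p/\partial S)_V = (\partial T/\partial V)_S$, now referring
to p. 43, can be traced back to


$$\Big(\frac{\partial T}{\partial V}\Big)_S = -\Big(\frac{\partial S}{\partial V}\Big)_T \Big(\frac{\partial T}{\partial S}\Big)_V,$$


and the second in a corresponding manner to


$$\Big(\frac{\partial T}{\partial p}\Big)_S = -\Big(\frac{\partial S}{\partial p}\Big)_T \Big(\frac{\partial T}{\partial S}\Big)_p.$$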


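Here the derivatives $\partial S/\partial T$ are related to the heat capacities. We avoid the notion
of specific heat (heat capacity/mass), because in the next section we divide by the
particle number N instead of the mass, which is theoretically more convenient:


$$\begin{aligned}
C_p &= T\Big(\frac{\partial S}{\partial T}\Big)_p = \Big(\frac{\partial H}{\partial T}\Big)_p && \text{isobaric heat capacity},\\
C_V &= T\Big(\frac{\partial S}{\partial T}\Big)_V = \Big(\frac{\partial U}{\partial T}\Big)_V && \text{isochoric heat capacity}.
\end{aligned}$$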


Besides these, we also introduce the compressibilities: 


$$\begin{aligned}
\kappa_T &= -\frac{1}{V}\Big(\frac{\partial V}{\partial p}\Big)_T && \text{isothermal compressibility},\\
\kappa_S &= -\frac{1}{V}\Big(\frac{\partial V}{\partial p}\Big)_S && \text{adiabatic (isentropic) compressibility}.
\end{aligned}$$


The signs for the heat capacities and compressibilities were chosen such that none
of the four coefficients is negative. According to p. 559, we have in particular
$(\partial U/\partial T)_V > 0$ with $(\Delta U)^2 > 0$, and according to p. 560, $(\partial V/\partial p)_T < 0$ with
$(\Delta V)^2 > 0$, whence $C_V > 0$ and $\kappa_T > 0$. In addition, we shall soon see that $C_p \ge C_V$
and $\kappa_S = (C_V/C_p)\,\kappa_T$.


The expansion coefficient $\alpha$ and the pressure coefficient $\beta$ are mostly positive,
but they can both be negative (e.g., in water at the freezing temperature). However,
at least their product is always positive.

The adiabatic compressibility can be determined from the sound velocity c and
the mass density $\rho$. In the case of sound, there is a force density $-\nabla p$, and therefore
the impulse density has the modulus dp/c. It is equal to the momentum density $c\,d\rho$.
Consequently, $c^2 = dp/d\rho$ holds. Here the entropy is conserved, because there is no
time for heat exchange. With $\rho\,(\partial p/\partial\rho)_S = V^{-1}\,(\partial p/\partial V^{-1})_S = -V\,(\partial p/\partial V)_S =
\kappa_S^{-1}$, we see that $\kappa_S$, $\rho$, and $c^2$ are actually connected:

$$\kappa_S\,\rho\,c^2 = 1.$$
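
As a rough numerical illustration of how the adiabatic compressibility follows from the sound velocity and the mass density via $c^2 = (\partial p/\partial\rho)_S = 1/(\rho\,\kappa_S)$, consider the sketch below. The density and sound velocity are approximate everyday values for air and are assumptions, not data from the text.

```python
def adiabatic_compressibility(rho, c):
    """kappa_S = 1 / (rho * c**2), from c**2 = (dp/drho)_S."""
    return 1.0 / (rho * c * c)

# Approximate values for air at room conditions (assumed, not from the text):
rho_air = 1.2     # mass density in kg/m^3
c_air = 343.0     # sound velocity in m/s

kappa_s = adiabatic_compressibility(rho_air, c_air)
print(kappa_s)    # in 1/Pa, of the order of 7e-6
```

The result is of the order of the inverse atmospheric pressure, as expected for a gas.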


The thermal coefficients for fixed intensive quantities are thus rather easy to
measure, including the expansion coefficient $\alpha$ and the heat capacity $C_p$ for fixed pressure,
as well as the isothermal compressibility $\kappa_T$. However, the pressure coefficient $\beta$ and
the heat capacity $C_V$ for fixed volume are not. Therefore, the following three relations
are helpful:


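• Firstly, the equation

  $$\beta = \frac{\alpha}{\kappa_T}.$$

  For its proof in $(\partial p/\partial T)_V$, we need only swap the fixed and the altered variable,
  according to p. 43.
• Secondly, the equation

  $$\frac{C_p}{C_V} = \frac{\kappa_T}{\kappa_S}.$$

  The left-hand side is equal to $(\partial S/\partial T)_p\,(\partial T/\partial S)_V$, and, according to p. 44, we
  may swap the pair (S, T) with the pair (p, V) to obtain the right-hand side.
• The third equation

  $$C_p - C_V = TV\alpha\beta$$

  follows immediately (as a product $T\cdot\beta\cdot V\alpha$), according to p. 43, from

  $$\Big(\frac{\partial S}{\partial T}\Big)_p = \Big(\frac{\partial S}{\partial T}\Big)_V + \Big(\frac{\partial S}{\partial V}\Big)_T \Big(\frac{\partial V}{\partial T}\Big)_p.$$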
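
The first and third relations can be checked for the simplest equation of state. The sketch below is an illustration under assumptions: it uses the ideal gas law $pV = NkT$ (proven in the text only later), computes the expansion coefficient, the pressure coefficient, and the isothermal compressibility by central finite differences, and compares $TV\alpha\beta$ with $Nk$, the standard ideal-gas value of $C_p - C_V$. The state values N, T, V are hypothetical.

```python
k = 1.380649e-23    # Boltzmann constant in J/K (exact SI value)

def pressure(T, V, N):
    return N * k * T / V          # ideal gas, p V = N k T

def volume(T, p, N):
    return N * k * T / p

def derivative(f, x, rel=1e-6):
    """Central finite difference with a step proportional to x."""
    h = rel * x
    return (f(x + h) - f(x - h)) / (2 * h)

N, T, V = 6.0e23, 300.0, 0.025    # illustrative state (roughly one mole)
p = pressure(T, V, N)

alpha = derivative(lambda t: volume(t, p, N), T) / V       # expansion coefficient
beta = derivative(lambda t: pressure(t, V, N), T)          # pressure coefficient
kappa_T = -derivative(lambda q: volume(T, q, N), p) / V    # isothermal compressibility

print(beta, alpha / kappa_T)          # first relation: beta = alpha / kappa_T
print(T * V * alpha * beta, N * k)    # third relation: T V alpha beta = N k for an ideal gas
print(alpha * T)                      # equals 1 for an ideal gas
```

For a real gas the same finite-difference helper could be applied to any other equation of state p(T, V, N), as long as it can be solved for V.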


With $\alpha\beta = \alpha^2/\kappa_T \ge 0$, we see that $\alpha$ and $\beta$ have equal sign. Independently of this
sign, we clearly have $C_p \ge C_V$ and $\kappa_T \ge \kappa_S$. Ten derivatives of the potentials can be
traced back to expansion and pressure coefficients in addition to T, S, p, and V (the
remaining thermal coefficients also occur in other derivatives):


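$$\begin{aligned}
\Big(\frac{\partial U}{\partial V}\Big)_T &= T\beta - p, &
\Big(\frac{\partial U}{\partial p}\Big)_T &= -\frac{\alpha V}{\beta}\,(T\beta - p),\\
\Big(\frac{\partial H}{\partial p}\Big)_T &= (1 - \alpha T)\,V, &
\Big(\frac{\partial H}{\partial V}\Big)_T &= -\frac{\beta}{\alpha}\,(1 - \alpha T),\\
\Big(\frac{\partial F}{\partial T}\Big)_p &= -S - \alpha pV, &
\Big(\frac{\partial F}{\partial p}\Big)_T &= \frac{\alpha pV}{\beta}, &
\Big(\frac{\partial F}{\partial V}\Big)_p &= -\frac{S}{\alpha V} - p,\\
\Big(\frac{\partial G}{\partial T}\Big)_V &= -S + \beta V, &
\Big(\frac{\partial G}{\partial V}\Big)_T &= -\frac{\beta}{\alpha}, &
\Big(\frac{\partial G}{\partial p}\Big)_V &= V - \frac{S}{\beta}.
\end{aligned}$$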


The first of these equations was already discussed on p. 567. The remaining ones 
follow in a similar way (Problem 6.34). 


6.4.6 Homogeneous Systems and the Gibbs—Duhem Relation 


How do the different quantities depend on the number of particles N? To answer 
this question we restrict ourselves now to particles of one sort and always assume 
homogeneous systems: all adjustable parameters have the same value everywhere, 
such that everything is in local equilibrium. 

As mentioned on p. 552, state variables are said to be extensive if they are propor- 
tional to the number of particles, e.g., S, V, and the thermodynamic potentials U, H, 
F, and G. In contrast, in equilibrium, intensive state variables have the same value 
everywhere, e.g., T, p, and u are intensive state variables. Except for the tempera- 
ture, all extensive quantities will be denoted with upper case letters and all intensive 
ones with lower case letters. 

Of course, we can also divide the extensive quantities by the particle number and
then arrive at intensive quantities. We denote them by the corresponding lower case
letters (the only exception is the temperature), and then we have no other extensive
quantities than N:

$$S = Ns, \quad V = Nv, \quad U = Nu, \quad H = Nh, \quad F = Nf, \quad G = Ng.$$


This separation is particularly convenient, if in addition to N only the intensive 
quantities T and p occur as independent variables, hence the natural variables of the 
free enthalpy G. 

If the weight of a particle (molecule) or the molecular weight $M_r$ is known,
then a scale suffices for the determination of the particle number $N = M/(M_r u)$
of a macroscopic sample, where $u = \frac{1}{12}$ of the mass of $^{12}$C is the atomic mass unit
(atomic mass constant) (see Table A.3). Therefore, “specific” quantities, i.e., divided
by the mass, are normally preferred, e.g., the specific heats rather than the heat
capacities/particle. (But note that the specific weight gives the ratio M/V.)

It is common to refer to a special particle number, namely the Loschmidt number
$N_L$. It corresponds to a mole, i.e., $M_r$ gram of the substance. Note that the Avogadro
constant $N_A$ differs only by dimension: $N_A = N_L$/mole. (This constant was
introduced by Avogadro in 1811, but the value of this number was first determined by
Loschmidt in 1865.) Then, for example, on p. 563, the melting heat was given in
kJ/mole. This is the heat needed for $N_L$ molecules. The product of $N_A$ and the Boltzmann
constant k is called the gas constant and denoted by


$$R = N_A k.$$


Quantities referring to one mole are common in physical chemistry and are called
molar quantities. To obtain these, we multiply the quantities valid for a single
molecule by the Avogadro constant $N_A$.
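
The conversion between molar and single-molecule quantities is a one-line calculation. The sketch below is an illustration: the numerical values of the Avogadro and Boltzmann constants are the exact SI (2019) values and are not quoted in the text, while the melting heat of about 6 kJ/mole is the value mentioned above for ice.

```python
N_A = 6.02214076e23   # Avogadro constant in 1/mol (exact SI value)
k = 1.380649e-23      # Boltzmann constant in J/K (exact SI value)

R = N_A * k           # gas constant in J/(mol K)
print(R)              # about 8.314 J/(mol K)

# Melting heat of ice, about 6 kJ/mol, converted to a single molecule:
melting_heat_per_molecule = 6.0e3 / N_A   # in J
print(melting_heat_per_molecule)          # about 1e-20 J
```

Dividing by $N_A$ in this way is the inverse of the multiplication described in the last sentence above.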

The chemical potential $\mu$ is the adjustable parameter corresponding to the particle
number N. According to p. 567, it is obtained from any of the four thermodynamic
potentials by differentiation with respect to N, if the other natural variables are
kept fixed. The free enthalpy is particularly suitable, because it depends otherwise
only on intensive quantities: $\mu = (\partial G/\partial N)_{T,p}$. Hence, for homogeneous systems
in equilibrium, $G = N\,g(T, p)$ clearly implies $\mu = g(T, p)$, and thus the famous
Gibbs–Duhem relation

$$G = \mu N,$$


which will prove to be extremely useful. For homogeneous systems, with

$$G = H - TS = F + pV = U - TS + pV,$$

it yields

$$H = TS + \mu N, \quad F = -pV + \mu N, \quad U = TS - pV + \mu N.$$

For homogeneous mixtures of different sorts of particles, $\mu N$ is to be replaced by
$\sum_i \mu_i N_i$, as shown on p. 587.

Note that the chemical potential always decreases with increasing temperature,
because $dF = -S\,dT - p\,dV + \mu\,dN$ implies the integrability condition


$$(\partial\mu/\partial T)_{V,N} = -(\partial S/\partial N)_{T,V} = -s\,(T, V),$$


and because the entropy is never negative.


6.4.7 Phase Transitions and the Clausius—Clapeyron 
Equation 


We shall now investigate the equilibrium condition for the exchange of particles,
energy, or volume, in particular the phase equilibrium. As is well known, the same
molecules may exist in different phases (aggregation states): solid, liquid, gaseous,
etc. In Sect. 6.3.8 we derived the constraints $T_+ = T_-$, $p_+ = p_-$, and $\mu_+ = \mu_-$.
According to the Gibbs–Duhem relation, we thus also have


$$g_+(T, p) = g_-(T, p).$$


This equation defines a coexistence curve p(T) in the (T, p) plane, where the two
phases are in equilibrium (see Fig. 6.17). Away from this curve, there is only the one
or the other phase, namely the one with the lower free enthalpy, as will be shown
in Sect. 6.4.9. Three phases may exist in simultaneous equilibrium only at the triple
point $T_0$, $p_0$. This is the meeting point of the three branches corresponding to the
phase equilibria for melting, vaporization, and sublimation, or those of other phase
transitions.

Fig. 6.17 For first order phase transitions, the first derivative of the free enthalpy G(T, p) makes
a jump, here indicated by the dashed red line. The structure indicated by the dotted blue lines
would have higher G than the stable phase (continuous green lines). Here $\partial G/\partial T = -S < 0$ and
$\partial G/\partial p = V > 0$ always hold

For the coexistence curve p(T), the differential equation of Clausius and Clapeyron
holds. Along this curve, we have $dg_+ = dg_-$. Hence $dg = -s\,dT + v\,dp$ leads to


$$-s_+\,dT + v_+\,dp = -s_-\,dT + v_-\,dp,$$

and this in turn implies the Clausius–Clapeyron equation:


$$\frac{dp}{dT} = \frac{s_+ - s_-}{v_+ - v_-}.$$


The entropy change $S_+ - S_-$ times the transition temperature T is equal to the
transition heat for the phase change: melting, vaporization, or sublimation heat (see
p. 563). For these heats, we are dealing with transition enthalpies, since we then have
to care for $\Delta p = 0$ and have therefore $T\,\Delta S = \Delta H$:

$$\frac{dp}{dT} = \frac{\Delta H}{T\,\Delta V}.$$


We usually have dp/dT > 0, but there are nevertheless also counter-examples,
for instance, for the transition ice $\to$ water with $\Delta H$ = 6.007 kJ/mol and $\Delta V$ =
−0.0900 cm$^3$/g.
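
For the ice-to-water example, the slope of the melting curve follows from $dp/dT = \Delta H/(T\,\Delta V)$, i.e., the Clausius–Clapeyron equation with $T\,\Delta S = \Delta H$. The sketch below is an illustration: the transition enthalpy and volume change are the values quoted above, while the melting temperature of 273.15 K and the molar mass of 18.015 g/mol for water are assumed standard values that the text does not state.

```python
T_melt = 273.15            # K, normal melting temperature of ice (assumed)
molar_mass = 18.015        # g/mol for H2O (assumed)
delta_H = 6.007e3          # J/mol, transition enthalpy from the text
delta_v = -0.0900          # cm^3/g, volume change from the text

# Convert the volume change to m^3/mol so that both differences refer to one mole.
delta_V = delta_v * molar_mass * 1e-6

dp_dT = delta_H / (T_melt * delta_V)    # slope of the melting curve in Pa/K
print(dp_dT)                            # negative, about -1.36e7 Pa/K
print(dp_dT / 1e5)                      # the same slope in bar/K, about -136
```

The negative sign means that raising the pressure lowers the melting temperature of ice, by somewhat less than 0.01 K per bar.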


The different substances in a mixture do not usually transform at the same tem- 
perature. If we have, for example, two metals mixed in a melt and then cool it down, 
without altering the pressure, then often only one of the metals will freeze, or at least 
with a mixing ratio different from the one given for the melt. The mixing ratio of the 
melt also changes, and along with it its transition temperature. On further cooling, 
the two metals do not necessarily segregate. The lowest melting temperature may
occur for a certain mixing ratio of the two metals, hence higher for neighboring mixing
ratios. This special mixture is called a eutectic: it freezes (at the eutectic
temperature) like a pure metal, while for other compositions, inhomogeneities are
formed in the alloy.

The mixing entropy is important for such mixtures, where we are concerned, for
example, with the lowering of the freezing point and the raising of the boiling
point of water by addition of salts. This will be discussed in Sect. 6.5.5, because only
there will we be able to determine the temperature change.


6.4.8 Enthalpy and Free Energy as State Variables 


The last two sections have shown the utility of the notion of free enthalpy G for 
homogeneous systems and for phase transitions. In particular, it is conserved for 
isobaric—isothermal processes, just as the internal energy is for isochoric—isentropic 
processes. In contrast, for phase transitions with volume changes, and fixed pressure, 
the enthalpy H (not the free enthalpy) is important for the transition heat, in addition 
to the internal energy and also the (mechanical) work p dV. 

The enthalpy is also important for the isentropic flow of frictionless liquids
through tube narrowings and widenings: here neither work nor heat is exchanged
through the wall of the tube, but pressure and temperature vary with the tube
cross-section. The idea is to follow a mass element M in a stationary flow, and in
addition to its internal energy U, to account also for its collective kinetic
energy $\frac12 M v^2$, work pV, and potential energy Mgh in the gravitational field
of the Earth. Only the sum of the enthalpy H = U + pV and the center-of-mass energy
$\frac12 M v^2 + Mgh$ is conserved along the path. Here the pressure changes with the
tube cross-section, as is easy to see for incompressible liquids because the
continuity equation requires $\nabla\cdot\mathbf{v} = 0$. The smaller the tube
cross-section, the higher the collective velocity v parallel to the wall, and the
lower the pressure on the wall. The Bernoulli equation (Daniel Bernoulli, 1738) can
be applied here. According to this, $\frac12 \rho v^2 + p + \rho g h$ is conserved
along the path, where the pressure dependence of the internal energy (for fixed
volume) is neglected compared to the other contributions, along with the
friction (viscosity).
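As a rough numerical illustration of the Bernoulli equation for a horizontal tube, the following sketch combines the continuity equation with the conserved quantity $\frac12\rho v^2 + p$. The function name, the default density of water, and the sample numbers are assumptions chosen only for this example.

```python
def throat_pressure(p1, v1, area1, area2, rho=1000.0):
    """Velocity and pressure in a narrowing for incompressible, frictionless flow.

    Assumes a horizontal tube (same height h on both sides), so that
    rho*v^2/2 + p is conserved along the path.
    """
    v2 = v1 * area1 / area2                  # continuity: A1 v1 = A2 v2
    p2 = p1 + 0.5 * rho * (v1**2 - v2**2)    # Bernoulli at equal height
    return v2, p2


if __name__ == "__main__":
    # Hypothetical numbers: water at 2 bar and 1 m/s entering a half-area throat.
    v2, p2 = throat_pressure(p1=2.0e5, v1=1.0, area1=2.0, area2=1.0)
    print(f"v2 = {v2:.2f} m/s, p2 = {p2:.0f} Pa")
```

The smaller cross-section gives the higher velocity and the lower wall pressure, as stated in the text.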

The enthalpy is conserved in the throttling experiment of Joule and Thomson. 
Here a suitable penetrable obstacle (“a piece of cotton wool”) ensures a pressure 
difference between the high and low pressure regions, and here again there is no 
heat exchange with the environment. The kinetic energy of the center-of-mass motion is
negligible (v ≈ 0), and therefore the enthalpy is conserved.


6.4 General Theorems of Thermodynamics 575 


For real gases in the throttling experiment, the temperature changes (Joule— 
Thomson effect). According to p. 43, we have 


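$$\left(\frac{\partial T}{\partial p}\right)_H = -\left(\frac{\partial T}{\partial H}\right)_p \left(\frac{\partial H}{\partial p}\right)_T .$$

Then, according to Sect. 6.4.5 with dH = T dS + V dp, and hence
$(\partial H/\partial T)_p = C_p$ and $(\partial H/\partial p)_T = (1-\alpha T)\,V$, we have

$$\left(\frac{\partial T}{\partial p}\right)_H = \frac{V}{C_p}\,(\alpha T - 1) .$$
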
Note that $C_p$ and V are extensive quantities and for the Joule-Thomson coefficient
only their ratio is important. Ideal gases have αT = 1 (as shown on p. 582). Hence the
throttle experiment with ideal gases proceeds along an isotherm. But for real gases,
αT may be larger or indeed smaller than 1. (For low temperatures the attractive
forces between the molecules are the stronger ones, so cooling by decompression
is possible, while at high temperatures the repulsive forces are the stronger ones,
so the gas heats up under decompression. However, under normal conditions, only
hydrogen and the noble gases have αT < 1.) In the (T, p) plane the two regions are
separated by the inversion curve. We shall also investigate all this more precisely for
a van der Waals gas (Sect. 6.6.2).

It is not the enthalpy, but the free energy F that is important for isothermal, 
reversible processes, e.g., if the system is coupled to a heat bath. With dT = 0, we 
have dF = —pdV. Thus the free energy F changes here by performing work. The 
free energy is the part of the internal energy which, for an isothermal, reversible 
process, can be extracted, while the rest U — F = TS is the energy bound in the 
irregular motion. In contrast, for an adiabatic isolated system, dS = 0 holds, and 
thus —pdV = dU. 

A very important example is the energy density of electromagnetic fields. According
to electrostatics, a potential energy
$\frac12\int dV\,\rho\,\Phi = \frac12\int dV\,\mathbf{E}\cdot\mathbf{D}$ is associated
with a charge density ρ and a potential Φ (see Sect. 3.1.8), while the magnetic field
is associated with the energy
$\frac12\int dV\,\mathbf{j}\cdot\mathbf{A} = \frac12\int dV\,\mathbf{H}\cdot\mathbf{B}$
(see Sect. 3.3.5). Here it is assumed that temperature and volume remain unchanged by
(quasi-statically) bringing the charges and currents from infinity to their respective
positions; only afterwards can the charge and current density change. Therefore, with
$\frac12(\mathbf{E}\cdot\mathbf{D} + \mathbf{H}\cdot\mathbf{B})$, we have identified
the density of the free energy.

We can also arrive at the free energy if we derive the state variables from the
canonical partition function $Z_C$. Sections 6.3.6 and 6.3.8 give in particular
$S = k(\ln Z_C + \lambda_E U)$, with $\lambda_E = (kT)^{-1}$, and thus
$-kT\ln Z_C = U - TS = F$:

$$F = -kT\ln Z_C , \quad\text{or}\quad Z_C = \exp\frac{-F}{kT} .$$


To compute this, T, V, and N are normally given. The conjugate variables follow
using dF = −S dT − p dV + μ dN:


576 6 Thermodynamics and Statistics 
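$$S = -\left(\frac{\partial F}{\partial T}\right)_{V,N} , \quad p = -\left(\frac{\partial F}{\partial V}\right)_{T,N} , \quad \mu = +\left(\frac{\partial F}{\partial N}\right)_{T,V} .$$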

The other thermodynamic potentials then result from 


$$U = F + TS , \quad G = F + pV , \quad H = U + pV ,$$


but the internal energy U, according to pp. 554 and 559, thus comes directly from

$$U = -\frac{\partial \ln Z_C}{\partial \lambda_E} = kT^2 \left(\frac{\partial \ln Z_C}{\partial T}\right)_{V,N} .$$


We can thus derive the thermal equation of state for p, V and T, and likewise the 
canonical equation of state for U, F, H and G, from the canonical partition function. 
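The route from a canonical partition function to F, U, and S can be sketched numerically. As an assumed example we take a single harmonic oscillator with $Z_C = 1/(2\sinh(\hbar\omega/2kT))$ in units where k = 1 and ħω = 1; the function names and the finite-difference step are choices made for this illustration only.

```python
import math

K_B = 1.0  # Boltzmann constant in natural units (assumption: k = 1)


def ln_z(temp, hbar_omega=1.0):
    """ln Z_C of one harmonic oscillator, Z_C = 1 / (2 sinh(hbar*omega / 2kT))."""
    return -math.log(2.0 * math.sinh(hbar_omega / (2.0 * K_B * temp)))


def potentials(temp, d_temp=1e-5):
    """Return (F, U, S) using F = -kT ln Z, U = kT^2 dlnZ/dT, S = (U - F)/T."""
    free_energy = -K_B * temp * ln_z(temp)
    # central difference for the temperature derivative of ln Z
    dlnz_dt = (ln_z(temp + d_temp) - ln_z(temp - d_temp)) / (2.0 * d_temp)
    internal_energy = K_B * temp**2 * dlnz_dt
    entropy = (internal_energy - free_energy) / temp
    return free_energy, internal_energy, entropy


if __name__ == "__main__":
    f, u, s = potentials(1.0)
    print(f"F = {f:.6f}, U = {u:.6f}, S = {s:.6f}")
```

For this oscillator the internal energy can be compared with the closed form $U = \frac12\hbar\omega\coth(\hbar\omega/2kT)$, and S with the derivative $-(\partial F/\partial T)$.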


6.4.9 Irreversible Alterations 


In this section, we have considered only reversible changes of state, even though at 
the beginning, in Sects. 6.4.1 and 6.4.2, we also allowed for irreversible ones. If we 
fix dt > 0 as there, then we generally have 


$$\begin{aligned}
dU &\le +T\,dS - p\,dV + \mu\,dN ,\\
dH &\le +T\,dS + V\,dp + \mu\,dN ,\\
dF &\le -S\,dT - p\,dV + \mu\,dN ,\\
dG &\le -S\,dT + V\,dp + \mu\,dN .
\end{aligned}$$


The first inequality was already proven in Sect. 6.4.1. The second follows from there 
with H = U + p V, the third with F = U — T S, and the fourth from the third with 
G=F+pV. 

The last two inequalities are particularly important, because it is not the entropy 
changes dS that are of interest, but the temperature differences dT. If we keep, e.g., 
T, p, and N fixed for an irreversible process, then the free enthalpy nevertheless 
decreases, i.e., dG < 0, because the system was not yet in equilibrium. Stable equi- 
librium states are the minima of the thermodynamic potentials. This means the free 
energy for fixed T, V, and N, and the free enthalpy for fixed T, p, and N. Of course, 
in each case, the entropy is also then as large as possible. We have already made 
use of this for the phase transition (Sect. 6.4.7): only the phase with the smaller free 
enthalpy is stable for given T and p. 


6.4.10 Summary: General Theorems of Thermodynamics 


We have derived relations between the macroscopic state variables T, S, p, V, μ, N,
U, H, F, and G, including equations for equilibrium states and reversible processes
and inequalities for non-equilibrium states and irreversible processes. This all follows
from the main theorems of thermodynamics, which can be justified microscopically
or required axiomatically, but which in either case must be tested by experience.
Basic for the first and second main theorems is the relation

$$dU \le T\,dS - p\,dV + \mu\,dN , \quad\text{for } dt > 0 ,$$

where U has the natural variables S, V, and N. This implies, for example,
$T = (\partial U/\partial S)_{V,N}$ and $p = -(\partial U/\partial V)_{S,N}$ as well as
Maxwell's integrability condition
$(\partial T/\partial V)_{S,N} = -(\partial p/\partial S)_{V,N}$. Other thermodynamic
potentials like F = U − TS, H = U + pV, and G = H − TS follow from Legendre
transformations (with other natural variables) and deliver further similar constraints.


6.5 Results for the Single-Particle Model 


6.5.1 Identical Particles and Symmetry Conditions 


In the last section, we presented macroscopic thermodynamics and derived general 
relations between observable quantities. Now we want to restrict ourselves to equi- 
librium states and special cases with known partition functions. Then according to 
p. 576, we may derive all thermal and canonical equations of states. 

Identical particles without correlations are particularly simple. Then the same
one-particle potential acts on all particles, and the probability distribution of the
many-particle problem splits into a product of one-particle distributions. These depend
on the one-particle states or on the cells in phase space of each individual particle
(μ-space). We order them with respect to their energy $e_i$, and degenerate ones in some
arbitrary way.

Now it is suggestive to assign to every particle its state, and thus fix the many-body 
state. This leads to Maxwell—Boltzmann statistics, although it contains an internal 
contradiction. In particular, we have assumed the ability to distinguish between the 
individual particles, otherwise we cannot decide how a given particle behaves in the 
course of time. Then distinguishing features are necessary, and therefore the particles 
cannot be completely identical. 

This contradiction does not occur in quantum theory, because there we have to
account for the exchange symmetry. Consider two particles in the states $|\alpha\rangle$
and $|\beta\rangle$. For bosons, only the symmetric state

$$|\alpha,\beta\rangle_s = +|\beta,\alpha\rangle_s \propto |\alpha\rangle|\beta\rangle + |\beta\rangle|\alpha\rangle$$

is permitted, and for fermions, only the antisymmetric state

$$|\alpha,\beta\rangle_a = -|\beta,\alpha\rangle_a \propto |\alpha\rangle|\beta\rangle - |\beta\rangle|\alpha\rangle .$$

In both cases, the first particle occurs with the same probability in the state
$|\alpha\rangle$ as in the state $|\beta\rangle$, and the second, of course, likewise.

Two bosons may occupy the same one-particle state, but not fermions, because
this contradicts the antisymmetry (Pauli principle). If $n_i$ is the occupation number of
the ith one-particle state, then we have the occupation-number representation (see
Sect. 5.3.5):

$$\begin{aligned}
\text{bosons:}\quad & |\vec n\rangle_s = |n_1, n_2, \ldots\rangle_s \quad\text{with } n_i \in \{0, 1, \ldots\} ,\\
\text{fermions:}\quad & |\vec n\rangle_a = |n_1, n_2, \ldots\rangle_a \quad\text{with } n_i \in \{0, 1\} .
\end{aligned}$$

Correspondingly, for bosons, we have Bose-Einstein statistics, and for fermions,
Fermi-Dirac statistics.

In the classical Maxwell-Boltzmann statistics, several particles may occupy the
same one-particle state. However, there the many-body state does not have to be
symmetric under particle exchange. There are classically more states (by the factor
$N!/(n_1!\,n_2!\cdots)$) than in Bose-Einstein statistics, because classically each
permutation counts as a new state. If all states are occupied just a little bit (all
$n_i$ = 0 or 1), then according to Stirling's formula, this produces an additional term
$k\ln N! \approx Nk\ln N$ in the entropy $S = k\ln Z_{MC}$. This addition does not
increase in proportion to N, even though it has to be an extensive variable. This
contradiction, occasionally called Gibbs' paradox, can only be removed by replacing
Z → Z/N! in classical statistics. This leads to the corrected Boltzmann statistics.
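The different ways of counting many-body states can be checked by brute force for small systems. The following sketch enumerates all occupation-number configurations of N particles on M one-particle states; the function names are illustrative choices, and the enumeration is only practical for small N and M.

```python
from itertools import product
from math import factorial, prod


def occupation_configs(n_particles, n_states):
    """All occupation-number tuples (n_1, ..., n_M) with n_1 + ... + n_M = N."""
    return [occ for occ in product(range(n_particles + 1), repeat=n_states)
            if sum(occ) == n_particles]


def count_states(n_particles, n_states):
    """Number of many-body states in classical, Bose and Fermi counting."""
    configs = occupation_configs(n_particles, n_states)
    bose = len(configs)                                   # one state per configuration
    fermi = sum(1 for occ in configs if max(occ) <= 1)    # Pauli principle
    # classically every configuration is counted N!/(n_1! n_2! ...) times
    classical = sum(factorial(n_particles) // prod(factorial(n) for n in occ)
                    for occ in configs)
    return classical, bose, fermi


if __name__ == "__main__":
    print(count_states(3, 4))
```

For N = 3 particles on M = 4 states the classical count equals $M^N$, while the Bose and Fermi counts are the binomial coefficients $\binom{N+M-1}{N}$ and $\binom{M}{N}$.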


6.5.2 Partition Functions in Quantum Statistics 


This is best evaluated for the grand canonical ensemble, for which the energy and 
particle number are given only on average. For a sharp particle number, the calculation 
is rather involved (see the textbook by Reif in the reading list on p. 620), and soluble 
only with an approximation, which is in effect the transition from the canonical to 
the grand canonical ensemble. Note that the volume should also be given, because 
the one-particle energies depend on it. 

If the ith one-particle state contains $n_i$ particles of energy $e_i$, then according to
the single-particle model, we have

$$N = \sum_i n_i \quad\text{and}\quad E = \sum_i n_i e_i ,$$

with $n_i \in \{0, 1, 2, \ldots\}$ for bosons and $n_i$ = 0 or 1 for fermions. Note that N and
E do not stand for the mean values here. For the grand canonical partition function
$Z_{GC} = \mathrm{tr}[\exp\{-(E - \mu N)/kT\}]$, with $\vec n = \{n_1, n_2, \ldots\}$, we obtain

$$Z_{GC} = \sum_{\{n_1, n_2, \ldots\}} \exp\frac{-\sum_i n_i (e_i - \mu)}{kT} .$$


6.5 Results for the Single-Particle Model 579 


The exponential function of a sum is equal to the product of the exponential functions:

$$Z_{GC} = \sum_{\{n_1, n_2, \ldots\}} \prod_i \exp\frac{-n_i (e_i - \mu)}{kT} .$$

In each term, the first factor is $\exp\{-n_1(e_1 - \mu)/kT\}$, then we have the factor
with i = 2, and then the remaining ones, whence we may write:

$$Z_{GC} = \prod_i \sum_{n_i} \exp\frac{-n_i (e_i - \mu)}{kT} .$$


For example, with $a = \exp\{-(e_1 - \mu)/kT\}$ and $b = \exp\{-(e_2 - \mu)/kT\}$, we have
initially $Z_{GC} = a^0 b^0 + a^0 b^1 + \cdots + a^1 b^0 + a^1 b^1 + \cdots + \cdots$, but this
sum of products may be written as a product of simple sums,
$Z_{GC} = (a^0 + a^1 + \cdots)(b^0 + b^1 + \cdots)\cdots$.

For bosons, we thus obtain the geometric series $\{1 - \exp(-(e_i - \mu)/kT)\}^{-1}$,
where the chemical potential μ keeps the average particle number finite, and thus
the geometric series converges. For fermions, on the other hand, we arrive at the sum
$1 + \exp(-(e_i - \mu)/kT)$. Therefore, the result may be reformulated as

$$Z_{GC} = \prod_i \left(1 \mp \exp\frac{-(e_i - \mu)}{kT}\right)^{\mp 1} ,$$

or again,

$$\ln Z_{GC} = \mp \sum_i \ln\left(1 \mp \exp\frac{-(e_i - \mu)}{kT}\right) ,$$


where the upper sign holds for bosons and the lower one for fermions. We will also 
keep to this notation in the following. 
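The factorization of the grand canonical partition function can be verified numerically for a few levels. The sketch below compares a brute-force sum over occupation numbers with the product formula; the level energies, the chemical potential, and the boson cutoff `n_max` are assumptions made only for this check.

```python
import math
from itertools import product


def z_gc_brute(energies, mu, kt, n_max):
    """Sum exp{-sum_i n_i (e_i - mu)/kT} over all n_i in {0, ..., n_max}."""
    total = 0.0
    for occ in product(range(n_max + 1), repeat=len(energies)):
        exponent = -sum(n * (e - mu) for n, e in zip(occ, energies)) / kt
        total += math.exp(exponent)
    return total


def z_gc_product(energies, mu, kt, bosons):
    """Product formula: (1 - x_i)^(-1) for bosons, (1 + x_i) for fermions."""
    z = 1.0
    for e in energies:
        x = math.exp(-(e - mu) / kt)
        z *= 1.0 / (1.0 - x) if bosons else 1.0 + x
    return z


if __name__ == "__main__":
    levels = [0.0, 1.0, 2.0]
    # fermions: n_i in {0, 1}; bosons: geometric series truncated at n_max = 60
    print(z_gc_brute(levels, -0.5, 1.0, 1), z_gc_product(levels, -0.5, 1.0, False))
    print(z_gc_brute(levels, -0.5, 1.0, 60), z_gc_product(levels, -0.5, 1.0, True))
```

For bosons the chemical potential must lie below the lowest level, otherwise the geometric series does not converge and the truncated sum no longer approximates the product.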

According to p. 556, the natural variables of the grand canonical partition function
are $\lambda_E$, $\lambda_N$, and V, or according to Sect. 6.3.8, T, μ, and V. Here, according
to Sect. 6.3.6, the entropy S is given by $k\ln Z_{GC} + (U - \mu N)/T$. Consequently,
$-kT\ln Z_{GC} = F - \mu N$ holds, and by the discussion on p. 567, this is the grand
canonical potential J:

$$J = -kT\ln Z_{GC} = F - \mu N = G - pV - \mu N ,$$

with

$$dJ = -S\,dT - p\,dV - N\,d\mu .$$

Using $Z_{GC}(T, V, \mu)$, the quantities S, p, and N may be derived immediately, and
then also the other potentials U, H, F, and G may be determined. According to the
Gibbs-Duhem relation, homogeneous systems have G = μN and thus J = −pV.




6.5.3 Occupation of One-Particle States 


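So far we have viewed the grand canonical partition function as a function of T, V,
and μ, but in the single-particle model, the energies $\{e_i\}$ replace the volume. These
depend not only on V, but also on the average one-particle potential. Therefore, from

$$\langle N\rangle = -\left(\frac{\partial J}{\partial \mu}\right)_{T,\{e_i\}} = kT\left(\frac{\partial \ln Z_{GC}}{\partial \mu}\right)_{T,\{e_i\}} = -kT\sum_i \left(\frac{\partial \ln Z_{GC}}{\partial e_i}\right)_{T,\mu,\{e_{k\ne i}\}} ,$$

we deduce the average occupation number of the ith one-particle state as

$$\langle n_i\rangle = -kT\left(\frac{\partial \ln Z_{GC}}{\partial e_i}\right)_{T,\mu,\{e_{k\ne i}\}} = \frac{1}{\exp\{(e_i - \mu)/kT\} \mp 1} .$$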


One-particle states of high energy ($e_i \gg \mu + kT$) are thus barely occupied. In
addition, as required by the Pauli principle,

$$0 \le \langle n_i\rangle \le 1 , \quad\text{for fermions},$$

while for bosons $\langle n_i\rangle$ may be greater than 1. But for the latter, due to the
constraint $N \ge \langle n_i\rangle \ge 0$, the chemical potential μ is restricted to
$\mu < \min e_i$, and so is never positive for $e_0 = 0$. In a grand canonical ensemble and
for $e_i < e_j$, for both sorts of particles, we have $\langle n_i\rangle > \langle n_j\rangle$.

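Since $\exp\{(e_i - \mu)/kT\} = \langle n_i\rangle^{-1} \pm 1$, the partition function
$Z_{GC}$ may be expressed by the average occupation numbers $\langle n_i\rangle$:

$$\ln Z_{GC} = \mp\sum_i \ln\left(1 \mp \frac{1}{\langle n_i\rangle^{-1} \pm 1}\right) = \mp\sum_i \ln\frac{1}{1 \pm \langle n_i\rangle} = \pm\sum_i \ln\,(1 \pm \langle n_i\rangle) .$$

Using this for the ith one-particle state, we may also give the probability for its
occupation by n particles. Here we write n instead of $n_i$. The partition function is
clearly equal to $(1 \pm \langle n\rangle)^{\pm 1}$ and from
$\rho = Z^{-1}\exp\{-(E - \mu N)/kT\}$ (see p. 555 and Sect. 6.3.8), it follows that

$$\rho_n = \frac{\exp\{-n\,(e_i - \mu)/kT\}}{(1 \pm \langle n\rangle)^{\pm 1}} = \frac{\langle n\rangle^n}{(1 \pm \langle n\rangle)^{n \pm 1}} .$$

For bosons, since $\rho_{n+1}/\rho_n = \langle n\rangle/(1 + \langle n\rangle) < 1$, the state
without particles always has the highest probability, and that with
$n \approx \langle n\rangle$ is not special at all. The situation is quite different for
fermions: for them, $0 \le \langle n\rangle \le 1$ and in addition
$\rho_0 = 1 - \langle n\rangle$ and $\rho_1 = \langle n\rangle$.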


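With the relation
$U - \mu N = \sum_i \langle n_i\rangle (e_i - \mu) = kT\sum_i \langle n_i\rangle \ln(\langle n_i\rangle^{-1} \pm 1)$,
the entropy

$$S = k\ln Z_{GC} + (U - \mu N)/T$$

becomes

$$\begin{aligned}
S &= \pm k\sum_i \left\{\ln\,(1 \pm \langle n_i\rangle) \pm \langle n_i\rangle \ln\frac{1 \pm \langle n_i\rangle}{\langle n_i\rangle}\right\} \\
  &= -k\sum_i \left\{\langle n_i\rangle \ln\langle n_i\rangle \mp (1 \pm \langle n_i\rangle)\ln\,(1 \pm \langle n_i\rangle)\right\} .
\end{aligned}$$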


Since x ln x = 0 for x = 0 and x = 1, the unoccupied states do not contribute to
the entropy, and likewise for fermion states with $\langle n_i\rangle = 1$. This can also be
justified by considering the uncertainty of the occupation number because, for the
squared fluctuation of the particle number in the ith one-particle state, using
$\lambda_N = -\mu/kT$ and

$$(\Delta n_i)^2 = -\frac{\partial\langle n_i\rangle}{\partial\lambda_N} = kT\,\frac{\partial\langle n_i\rangle}{\partial\mu} = \frac{\exp\{(e_i - \mu)/kT\}}{[\exp\{(e_i - \mu)/kT\} \mp 1]^2} ,$$

we obtain the noteworthy result (see Fig. 6.18)

$$(\Delta n_i)^2 = \langle n_i\rangle\,(1 \pm \langle n_i\rangle) .$$

This vanishes when $\langle n_i\rangle = 0$ and also for fermions when
$\langle n_i\rangle = 1$, while for bosons, when $\langle n_i\rangle \gg 1$, the error width is
$\Delta n_i \approx \langle n_i\rangle$, not $\sqrt{\langle n_i\rangle}$, as would be expected
classically. Note also that, for fermions with $\langle n_i\rangle = \frac12$, the error width
is $\frac12$.
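The fluctuation formula can be cross-checked directly from the occupation probabilities $\rho_n = \langle n\rangle^n/(1 \pm \langle n\rangle)^{n\pm1}$. The following sketch computes mean and variance from this distribution; the function names and the boson cutoff `n_max` are assumptions for the illustration.

```python
def occupation_probabilities(mean_n, bosons, n_max=2000):
    """Probabilities rho_n for n particles in one state with mean occupation <n>."""
    if bosons:
        ratio = mean_n / (1.0 + mean_n)            # rho_{n+1} / rho_n
        return [ratio**n / (1.0 + mean_n) for n in range(n_max + 1)]
    return [1.0 - mean_n, mean_n]                  # fermions: n = 0 or 1 only


def moments(probs):
    """Mean and variance of the occupation number for a given distribution."""
    mean = sum(n * p for n, p in enumerate(probs))
    variance = sum(n * n * p for n, p in enumerate(probs)) - mean**2
    return mean, variance


if __name__ == "__main__":
    for bosons, mean_n in ((True, 2.5), (False, 0.5)):
        mean, var = moments(occupation_probabilities(mean_n, bosons))
        sign = 1.0 if bosons else -1.0
        print(bosons, mean, var, mean_n * (1.0 + sign * mean_n))
```

The last printed column is $\langle n\rangle(1 \pm \langle n\rangle)$ and should agree with the variance computed from the distribution.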

With decreasing temperature the states of higher energy become ever more depopulated.
In the limit T ≈ 0, fermions only occupy one-particle states with $e_i < \mu$,
while the states above stay empty. Then we have a degenerate Fermi gas with
μ(T = 0) as the Fermi energy $e_F$. We shall return to this in Sect. 6.5.6. With decreasing
temperature, bosons crowd into the one-particle state of lowest energy $e_0$. Their
chemical potential for T ≈ 0 is thus determined by the constraint
$\langle N\rangle \approx \langle n_0\rangle$, which yields
$\mu \approx e_0 - kT\ln(1 + \langle N\rangle^{-1}) \approx e_0 - kT/N$. More on that in
Sect. 6.6.6.

Fig. 6.18 Occupation number $\langle n\rangle$ ($\pm\Delta n$) of the one-particle states as a
function of (e − μ)/kT for bosons (red curve) and fermions (blue curve). We also show
$\langle n\rangle \pm \Delta n$ (dashed curves) for bosons and for fermions. With
$(\Delta n)^2 = -\partial\langle n\rangle/\partial\lambda_N$ and $\lambda_N = -\mu/kT$, the
uncertainty is greater, the more rapidly $\langle n(x)\rangle$ decreases. Note that the base
line here appears shifted to negative values!


6.5.4 Ideal Gases 


For high temperatures, a great many states are occupied with nearly equal probability.
For $\langle N\rangle$ to remain finite, we must then have $\exp\{(e_i - \mu)/kT\} \gg 1$ for
all i, and hence $-\mu \gg kT$. But then Bose-Einstein and Fermi-Dirac statistics no longer
differ, because the exchange symmetry no longer plays a role if all one-particle states
are barely occupied. According to the above remarks, we then have

$$-\frac{J}{kT} = \ln Z_{GC} \approx \sum_i \exp\frac{-(e_i - \mu)}{kT} \approx \sum_i \langle n_i\rangle = \langle N\rangle .$$

If we make use here of the Gibbs-Duhem relation for homogeneous systems, thus
J = −pV, we obtain the Gay-Lussac law, which is just the thermal equation of state
for ideal gases, viz.,

$$pV = NkT .$$

Then using the results $\alpha = V^{-1}(\partial V/\partial T)_{p,N}$,
$\kappa_T = -V^{-1}(\partial V/\partial p)_{T,N}$, $\beta = \alpha/\kappa_T$,
and $C_p - C_V = \alpha\beta TV$, we obtain

$$\alpha = \frac{1}{T} , \quad \kappa_T = \frac{1}{p} , \quad \beta = \frac{p}{T} , \quad C_p - C_V = Nk .$$

Hence for ideal gases with $(\partial U/\partial V)_T = -p + \beta T$ and
$(\partial H/\partial p)_T = (1 - \alpha T)V$ (see p. 570), both $(\partial U/\partial V)_T$ and
$(\partial H/\partial p)_T$ are zero. For (reversible) isothermal processes in ideal gases,
when the volume changes, the internal energy is conserved, and when the pressure changes,
the enthalpy is conserved. Consequently, for ideal gases, there is no Joule-Thomson effect,
something we commented on already on p. 575.

Clearly, the canonical partition function for a particle may be extracted from
the above-mentioned equation $N \approx \sum_i \exp\{-(e_i - \mu)/kT\}$, and we denote this by
$Z_C(1)$, whence $Z_C(1)/N$ is an intensive variable:

$$N = Z_C(1)\,\exp\frac{\mu}{kT} .$$

The factor exp(μ/kT) is called the fugacity, and in physical chemistry, the absolute
activity of the material. We shall soon determine $Z_C(1)$ for important examples, and
hence also μ and, via the Gibbs-Duhem relation, G:

$$\mu = -kT\ln\frac{Z_C(1)}{N} \quad\text{and}\quad G = -NkT\ln\frac{Z_C(1)}{N} .$$


Hence we obtain the free energy F = G − pV = G − NkT if we also use the Gay-Lussac
law. The internal energy

$$U = F + TS = F - T\left(\frac{\partial F}{\partial T}\right)_{V,N} = -T^2\left(\frac{\partial (F/T)}{\partial T}\right)_{V,N}$$

yields

$$U = NkT^2\left(\frac{\partial \ln Z_C(1)}{\partial T}\right)_V ,$$

and the enthalpy H = U + NkT. For the entropy, we obtain

$$S = -\left(\frac{\partial F}{\partial T}\right)_{V,N} = Nk\left\{\ln\frac{Z_C(1)}{N} + 1 + T\left(\frac{\partial \ln Z_C(1)}{\partial T}\right)_V\right\} .$$

Here we have required $-\mu \gg kT$, and hence $\ln(Z_C(1)/N) \gg 1$, but it is not necessary
that it should be very much greater than 3, as can be seen from Fig. 6.19. The canonical
partition function $Z_C(1)$ is determined according to the internal degrees of freedom
of the given gas.

For the ideal monatomic gases, up to rather high temperatures (1 eV ≙ 11 600 K),
there is no internal excitation of the atoms (the electronic degrees of freedom are
frozen), so what is important for $e_i$ is only the kinetic energy $p_i^2/2m$ of their
centers of mass. Here, according to p. 525, a particle confined to a cube of volume
$V = L^3$ has the momentum eigenvalues $\mathbf{p}_i = \mathbf{n}_i\,\hbar\pi/L$, where
$\mathbf{n}_i$ may have only natural numbers as Cartesian components, and no negative
integers. If we insert this into the canonical partition function and replace the sum by
an integral, we obtain

$$Z_C(1) = \frac{1}{8}\int d^3n\,\exp\frac{-(n\hbar\pi/L)^2}{2m\,kT} = \frac{4\pi}{8}\int_0^\infty dn\,n^2\exp\frac{-\hbar^2\pi^2 n^2}{2m\,kT\,L^2} ,$$

and therefore, since $\int_0^\infty dx\,x^2\exp(-ax^2) = \frac14\sqrt{\pi/a^3}$,

$$Z_C(1) = \left(\frac{kT}{4\pi\hbar^2/2m}\right)^{3/2} V = \frac{V}{\lambda^3} ,$$

where the thermal de Broglie wavelength is defined by

$$\lambda = \frac{h}{\sqrt{2\pi m kT}} .$$

However, the Maxwell distribution for $\langle h/mv\rangle$ delivers twice this value (see
Problem 6.11), so the name is not quite satisfying. The result holds for high temperatures,
and not only for a cube. For $V \gg \lambda^3$, other restrictions deliver the same value for
the partition function $Z_C(1)$.


Fig. 6.19 The single-particle model for ideal monatomic gases yields the equations
mentioned in the text, if $T \gg T_0$ with $kT_0 = 4\pi (N/V)^{2/3}\,\hbar^2/2m$. But for
$T \approx T_0$, the exchange symmetry contributes. The upper curve is for bosons and the
lower curve for fermions. We return to this in Fig. 6.22. (Plotted quantity:
$\ln[Z_C(1)/N] = -\mu/kT$.)

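Consequently, for ideal monatomic gases, we find
$(\partial \ln Z_C(1)/\partial T)_V = \frac32 T^{-1}$ and, as expected by the equidistribution
law (p. 559),

$$U = \tfrac32 NkT , \quad H = \tfrac52 NkT , \quad S = Nk\left(\ln\frac{Z_C(1)}{N} + \frac52\right) .$$

Hence with $C_V = (\partial U/\partial T)_{V,N} = C_p - Nk$ and $\kappa_S = \kappa_T C_V/C_p$,
we have

$$C_V = \tfrac32 Nk , \quad C_p = \tfrac52 Nk , \quad\text{and}\quad \kappa_S = \tfrac35\kappa_T = \tfrac35\,p^{-1} .$$

If we relate to a mole, then according to p. 572, we have to take the gas constant R
instead of Nk.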
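To get a feeling for the orders of magnitude, the formulas for the monatomic ideal gas can be evaluated numerically. The sketch below uses SI constants and, as an assumed example, helium (mass 4.0026 u) at 300 K and atmospheric pressure; the function names are illustrative.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23          # Boltzmann constant, J/K
U_MASS = 1.66053907e-27     # atomic mass unit, kg


def thermal_wavelength(mass, temp):
    """Thermal de Broglie wavelength h / sqrt(2 pi m k T)."""
    return H_PLANCK / math.sqrt(2.0 * math.pi * mass * K_B * temp)


def monatomic_gas(mass, temp, pressure):
    """Return (lambda, ln(Z_C(1)/N), mu, S/(N k)) for an ideal monatomic gas."""
    lam = thermal_wavelength(mass, temp)
    volume_per_particle = K_B * temp / pressure      # V/N from pV = NkT
    ln_z_over_n = math.log(volume_per_particle / lam**3)
    mu = -K_B * temp * ln_z_over_n                   # mu = -kT ln(Z_C(1)/N)
    entropy_per_nk = ln_z_over_n + 2.5               # S/(Nk) = ln(Z_C(1)/N) + 5/2
    return lam, ln_z_over_n, mu, entropy_per_nk


if __name__ == "__main__":
    # Assumed example: helium at 300 K and 101325 Pa
    lam, ln_z, mu, s = monatomic_gas(4.0026 * U_MASS, 300.0, 101325.0)
    print(f"lambda = {lam:.3e} m, ln(Z/N) = {ln_z:.2f}, S/(Nk) = {s:.2f}")
```

Under these conditions ln(Z_C(1)/N) is much larger than 1, so the classical limit required in the text is well satisfied.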


For ideal diatomic gases, the molecules rotate and oscillate. As long as the moment
of inertia does not change notably despite the oscillations, the canonical partition
function of a molecule may be written as the product of the canonical partition
functions for the center-of-mass motion, the rotations, and the oscillations,
disregarding electronic degrees of freedom, which do not contribute anything (as
established above).

At room temperature, in addition to the electronic excitations, the oscillations are
also frozen. The rotations of diatomic molecules for constant moment of inertia Θ
have the energy $j(j+1)\,\hbar^2/2\Theta$, and each level is (2j + 1)-fold degenerate due to
the isotropy. Therefore, we have

$$Z_{C\,\mathrm{rot}}(1) = \sum_j (2j+1)\exp\left(-\frac{j(j+1)\,\hbar^2}{2\Theta\,kT}\right) .$$

We evaluate this sum again via an integral, and use the continuous variable

$$x = (j + \tfrac12)\,\hbar/\sqrt{2\Theta kT} .$$

For molecules containing two identical atoms, however, the states with odd angular
momentum do not occur, and this halves the partition function. Without this factor of
$\frac12$ (thus in the case of non-identical atoms), with
$\int_0^\infty dx\,2x\exp(-x^2) = 1$, we obtain

$$Z_{C\,\mathrm{rot}}(1) \approx \frac{kT}{\hbar^2/2\Theta} \quad\text{for}\quad kT \gg \frac{\hbar^2}{2\Theta} .$$

For sufficiently high temperatures, the product of the partition functions is

$$Z_C(1) = \frac{kT}{\hbar^2/2\Theta}\,V\left(\frac{kT}{4\pi\hbar^2/2m}\right)^{3/2} ,$$

and thus now $(\partial \ln Z_C(1)/\partial T)_V = \frac52 T^{-1}$. For all diatomic molecules
(of identical or non-identical atoms) and for sufficiently high temperatures, it thus
follows that

$$U = \tfrac52 NkT , \quad H = \tfrac72 NkT , \quad S = Nk\left(\ln\frac{Z_C(1)}{N} + \frac72\right) .$$


This result does not contradict the equidistribution law, because for a diatomic
molecule, the moment of inertia about the symmetry axis is small compared
to the other two, so this rotation is frozen. Therefore, for the symmetric top (see
p. 145), we only have $H_{\mathrm{rot}} = (p_\beta^2 + p_\alpha^2/\sin^2\beta)/2\Theta$. Each of
the N molecules thus contributes to the internal energy $\frac32 kT$ from the translational
motion and also $\frac22 kT$ from the rotation. Note that the factor of $p_\alpha^2$ is not
fixed but depends on β, but this does not affect the equidistribution law, as was shown by
its proof on p. 559. We thus obtain

$$C_V = \tfrac52 Nk , \quad C_p = \tfrac72 Nk , \quad\text{and}\quad \kappa_S = \tfrac57\kappa_T ,$$

with $\kappa_T = p^{-1}$, as for all ideal gases.
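These expressions are of course only valid for ideal diatomic gases as long as the
oscillations are frozen. Otherwise, we must consider

$$Z_{C\,\mathrm{vib}}(1) = \sum_{n=0}^\infty \exp\left(-\frac{(n+\frac12)\,\hbar\omega}{kT}\right) = \frac{\exp(-\frac12\hbar\omega/kT)}{1 - \exp(-\hbar\omega/kT)} = \frac{1}{2\sinh(\hbar\omega/2kT)} .$$

If this degree of freedom is fully thawed, i.e., $kT \gg \hbar\omega$, then this results in
$kT/\hbar\omega$, whence

$$Z_C(1) = \frac{kT}{\hbar\omega}\,\frac{kT}{\hbar^2/2\Theta}\,V\left(\frac{kT}{4\pi\hbar^2/2m}\right)^{3/2} .$$

Then,

$$U = \tfrac72 NkT , \quad H = \tfrac92 NkT , \quad S = Nk\left(\ln\frac{Z_C(1)}{N} + \frac92\right) ,$$

and

$$C_V = \tfrac72 Nk , \quad C_p = \tfrac92 Nk , \quad \kappa_S = \tfrac79\kappa_T .$$
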


If the molecules consist of two identical atoms, then in fact the above-mentioned
factor $\frac12$ changes the expression for the state sum $Z_C(1)$ by a factor of 2, which
modifies μ only by Δμ = kT ln 2 and S by ΔS = −Nk ln 2. Also unimportant according to
the equidistribution law is whether the molecules consist of identical or non-identical
atoms.
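The quality of the high-temperature approximations for the rotational and vibrational partition functions can be examined numerically. In the sketch below the temperature enters only through the dimensionless ratios $kT/(\hbar^2/2\Theta)$ and $\hbar\omega/kT$; the function names and the cutoff `j_max` are assumptions for this illustration.

```python
import math


def z_rot_sum(t_ratio, j_max=500):
    """Direct sum of (2j+1) exp{-j(j+1)/t_ratio}, with t_ratio = kT / (hbar^2 / 2 Theta)."""
    return sum((2 * j + 1) * math.exp(-j * (j + 1) / t_ratio)
               for j in range(j_max + 1))


def z_vib(x):
    """Oscillator partition function 1 / (2 sinh(x/2)), with x = hbar*omega / kT."""
    return 1.0 / (2.0 * math.sinh(x / 2.0))


if __name__ == "__main__":
    for t_ratio in (0.5, 5.0, 50.0):
        # ratio of the exact sum to the high-temperature value kT / (hbar^2 / 2 Theta)
        print(t_ratio, z_rot_sum(t_ratio) / t_ratio)
    print(z_vib(0.01) * 0.01)   # tends to 1 when kT >> hbar*omega
```

The ratio of the exact rotational sum to its high-temperature value approaches 1 only for kT well above $\hbar^2/2\Theta$, while at low temperatures only the j = 0 term survives and the rotation is frozen.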


6.5.5 Mixing Entropy and the Law of Mass Action 


Mixtures of several materials may be evaluated rather simply as long as no correlations
have to be accounted for. To begin with, we consider a segregated equilibrium
state, with the same temperature and pressure everywhere. Each part has
its particle number $N_i$ corresponding to its volume $V_i$ and entropy $S_i(T, p, N_i)$.
The total volume is $V = \sum_i V_i$, the energy $U = \sum_i U_i$, and the entropy
$S = \sum_i S_i$. If we now allow for a complete mixture with fixed U and V, then the
entropy increases, because the number of accessible states increases with the volume. We
restrict ourselves here to ideal gases. Then the chemical potential changes with
$\mu = -kT\ln(Z_C(1)/N)$ and $Z_C(1) \propto V$ by $-kT\ln(V/V_i) = -kT\ln(N/N_i)$, and
the entropy $S_i$ by $N_i k\ln(N/N_i)$. Consequently, the mixing entropy amounts to

$$S_M = -k\sum_i N_i\ln\frac{N_i}{N} > 0 .$$

The mixing is an irreversible process, because the entropy increases. Since $N_i/N$
is the probability $p_i$ for the component i, we find
$S_M/N = s_M = -k\sum_i p_i\ln p_i$ for the mixing entropy per particle. This fits very well
with the notion of information entropy (Sect. 6.1.6).
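The mixing entropy is simple to evaluate for given particle numbers. The following sketch implements $S_M = -k\sum_i N_i\ln(N_i/N)$ in units where k = 1 by default; the function name and the sample numbers are illustrative assumptions.

```python
import math


def mixing_entropy(particle_numbers, k=1.0):
    """Mixing entropy S_M = -k sum_i N_i ln(N_i/N) of an ideal mixture."""
    total = sum(particle_numbers)
    return -k * sum(n * math.log(n / total) for n in particle_numbers if n > 0)


if __name__ == "__main__":
    print(mixing_entropy([1.0, 1.0]))     # equal parts of two components
    print(mixing_entropy([99.0, 1.0]))    # dilute admixture
    print(mixing_entropy([5.0]))          # a single component does not mix
```

For equal parts of two components the result is N k ln 2, i.e., k ln 2 per particle, in line with the information entropy of a fair binary choice.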

The mixing entropy depends only on the different particle numbers, not on the
nature of the components. This leads to Gibbs' paradox. According to classical conceptions
the difference between the particle types would have to vanish continuously. Even though
a mixture would then no longer be conceivable, the last equation would still be valid.
According to quantum theory, the transition is not continuous, however.

We found the Gibbs-Duhem relation G = μN for pure homogeneous systems in
Sect. 6.4.6 and now want to generalize it to systems of different materials (as long
as they do not react chemically). For homogeneous mixtures of different particles
(e.g., solutions), the equilibrium condition for given T and p is

$$\mu_i = \left(\frac{\partial G}{\partial N_i}\right)_{T,p,\{N_{k\ne i}\}} .$$

G is a homogeneous function of first order in the particle numbers $N_i$, since
thermodynamic potentials of homogeneous systems are extensive variables. For arbitrary
x > 0, we have $x\,G(T, p, N_1, N_2, \ldots) = G(T, p, xN_1, xN_2, \ldots)$. If we
differentiate this with respect to x at the position 1 and make use of Euler's theorem for
homogeneous functions, we may deduce the important generalized Gibbs-Duhem relation

$$G = \sum_i \mu_i N_i .$$


Here the mixing entropy also affects the free enthalpy, and in particular, $g_i = G_i/N_i$
denotes the free enthalpy per particle for pure systems ($G_{\mathrm{pure}} = \sum_i G_i$). Then
for mixtures (of ideal gases), we have $G = G_{\mathrm{pure}} - T S_M$, and hence,

$$G = \sum_i N_i\left(g_i + kT\ln\frac{N_i}{N}\right) .$$

From the comparison with the generalized Gibbs-Duhem relation, we conclude that

$$\mu_i = g_i + kT\ln\frac{N_i}{N} .$$

The mixing entropy thus lowers the chemical potential, which is now different from
the free enthalpy.

We can exemplify the above by considering the thawing of ice with salt, assuming
that the salt is dissolved only in the water, but not also in the ice. At the transition
point, both phases have to have the same chemical potential. If, in a similar way to
p. 563, we denote the solid phase by [ ] and the liquid by ( ), then at the freezing
temperature of pure water, we have $g_{[\,]}(T, p) = g_{(\,)}(T, p)$, in contrast to the
freezing temperature of salt water:

$$g_{[\,]}(T + \Delta T, p) = g_{(\,)}(T + \Delta T, p) + k\,(T + \Delta T)\,\ln\frac{N_W}{N_W + N_S} .$$

Therefore, to the first approximation,

$$\Delta T\left[\left(\frac{\partial (g_{[\,]} - g_{(\,)})}{\partial T}\right)_p + k\ln\left(1 + \frac{N_S}{N_W}\right)\right] = -kT\ln\left(1 + \frac{N_S}{N_W}\right) ,$$

where, since dG = −S dT + V dp and dH = T dS + V dp, we may use

$$\left(\frac{\partial (g_{[\,]} - g_{(\,)})}{\partial T}\right)_p = s_{(\,)} - s_{[\,]} = \frac{\Delta h}{T} > 0 ,$$

with the melting heat Δh per particle. The reduction in the freezing temperature is thus

$$\Delta T = -\frac{kT^2\ln(1 + N_S/N_W)}{\Delta h + kT\ln(1 + N_S/N_W)} .$$

For small salt concentrations and for one mole, it follows that

$$\Delta T \approx -\frac{RT^2}{\Delta H}\,\frac{N_S}{N_W} ,$$

where ΔH is the melting heat of water per mole (6 kJ). Every percent of salt lowers
the freezing temperature by one degree centigrade.
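The estimate of one degree per percent of salt can be reproduced with a few lines of code. The sketch below evaluates the freezing-point shift per mole, i.e., with k replaced by the gas constant R and Δh by ΔH = 6 kJ/mol as quoted in the text; the function name and the choice T = 273.15 K are assumptions for this example.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)


def freezing_shift(salt_ratio, temp=273.15, melt_heat=6000.0):
    """Freezing-point shift for the ratio N_S/N_W: (full expression, linear approximation)."""
    log_term = math.log(1.0 + salt_ratio)
    full = -R_GAS * temp**2 * log_term / (melt_heat + R_GAS * temp * log_term)
    linear = -R_GAS * temp**2 * salt_ratio / melt_heat
    return full, linear


if __name__ == "__main__":
    for ratio in (0.01, 0.05, 0.10):
        full, linear = freezing_shift(ratio)
        print(f"N_S/N_W = {ratio:.2f}: dT = {full:.2f} K (linear: {linear:.2f} K)")
```

The linear approximation overestimates the magnitude of the shift slightly at higher concentrations, where the model of an ideal mixture is in any case only a rough guide.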

If we now also allow for chemical reactions, then the equilibrium condition
$\sum_i \nu_i\mu_i = 0$ on p. 561 initially delivers the equation
$\sum_i \nu_i g_i = -kT\sum_i \ln(N_i/N)^{\nu_i}$. Hence, we have the law of mass action, viz.,

$$\prod_i \left(\frac{N_i}{N}\right)^{\nu_i} = \exp\frac{-\sum_i \nu_i g_i}{kT} \equiv K(T, p) ,$$

with given fixed temperature and pressure. Of interest is then the difference between
the free enthalpies before and after the reaction, in contrast to the difference between
the free energies for isochoric instead of isobaric processes. The equilibrium constant
K depends on the chemical nature of the materials, but not on the concentration
(which is of course the important aspect of the law of mass action).

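The temperature dependence of the chemical reaction follows from

$$\left(\frac{\partial \ln K}{\partial T}\right)_p = -\sum_i \nu_i\left(\frac{\partial (g_i/kT)}{\partial T}\right)_p .$$

Hence, with $(\partial g/\partial T)_p = -s$ and g + Ts = h, we obtain

$$\left(\frac{\partial \ln K}{\partial T}\right)_p = \frac{\sum_i \nu_i h_i}{kT^2} .$$

For constant pressure, heating thus shifts the reaction equilibrium in favor of the
enthalpy-rich side (endothermic reaction).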


6.5.6 Degenerate Fermi Gas and Conduction Electrons 
in Metals 


For typical temperatures, the conduction electrons in metals form a degenerate Fermi
gas. According to the considerations on p. 582, their chemical potential μ for the
temperature T = 0 is equal to the Fermi energy $e_F = p_F^2/2m$. On p. 525, we determined
the number of motional states whose energies $e_i$ are smaller than the Fermi energy:

$$\Omega = \frac{V}{(2\pi\hbar)^3}\,\frac{4\pi}{3}\,p_F^3 = \frac{V}{6\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2} e_F^{3/2} .$$

Furthermore, two spin states are associated with each of these states, so for N electrons
in the volume V, we obtain the Fermi energy

$$e_F = \frac{\hbar^2}{2m}\left(3\pi^2\,\frac{N}{V}\right)^{2/3} .$$

In metals, this energy is very much higher than kT (even at 1000 K) and the electron
gas is therefore degenerate (see the Fermi distribution function in Fig. 6.20).

Fig. 6.20 Fermi distributions $1/\{1 + \exp[(e - \mu)/kT]\}$ for $T/T_0 = \frac12$ (red curve),
1 (blue curve), and 2 (green curve). Note that, in Fig. 6.18, there is only a single curve,
because for each temperature a different energy unit was taken. Here, the one-particle
ground state energy lies very far to the left!
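The claim that the Fermi energy in metals is far above kT can be made concrete. The sketch below evaluates $e_F = (\hbar^2/2m)(3\pi^2 N/V)^{2/3}$ with SI constants; the conduction-electron density used for copper (about 8.47 × 10^28 per cubic meter) is an assumed literature value that does not come from the text.

```python
import math

HBAR = 1.054571817e-34     # reduced Planck constant, J s
M_E = 9.1093837015e-31     # electron mass, kg
K_B = 1.380649e-23         # Boltzmann constant, J/K
EV = 1.602176634e-19       # electron volt, J


def fermi_energy(number_density):
    """Fermi energy of a free electron gas (two spin states per orbital state)."""
    return HBAR**2 / (2.0 * M_E) * (3.0 * math.pi**2 * number_density) ** (2.0 / 3.0)


if __name__ == "__main__":
    n_copper = 8.47e28     # assumed conduction-electron density of copper, 1/m^3
    e_f = fermi_energy(n_copper)
    print(f"e_F = {e_f / EV:.2f} eV, e_F/k = {e_f / K_B:.3e} K")
    print(f"e_F / kT at 1000 K = {e_f / (K_B * 1000.0):.1f}")
```

With this density the Fermi energy comes out at several electron volts, so that even at 1000 K the ratio of $e_F$ to kT is of the order of a hundred.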
When computing mean values for Fermi gases, we always encounter expressions like

$$\langle A\rangle = \sum_i a_i\,\langle n_i\rangle = \sum_i a_i\left(\exp\frac{e_i - \mu}{kT} + 1\right)^{-1} ,$$

for which we shall now give a useful computational method for low temperatures.
For high temperatures, we would have an ideal gas. If the values $a_i$ depend only
weakly on the index i and if sufficiently many states contribute, the sum may be
replaced by an integral:

$$\langle A\rangle \approx \int_0^\infty \frac{a(e)\,g(e)\,de}{\exp\{(e - \mu)/kT\} + 1} ,$$

where g(e) is the density of states for a particle. Note that we have to add an argument
e in order to avoid confusion with the free enthalpy per particle. For T = 0, and
therefore $\mu = e_F$, only the integral from 0 to $e_F$ is important, since the denominator
there is equal to one. However, with increasing temperature, the states for
$e \approx e_F$ are reshuffled (see the last figure).

For the expansion in terms of powers of T, we consider the expression

$$\mathcal{F} = \int_0^\infty \frac{f(x)\,dx}{\exp\{\beta(x - x_0)\} + 1} \quad\text{with } \beta > 0 \text{ and } \beta x_0 \gg 1 ,$$

i.e., actually for $\mu \gg kT$, which applies to a degenerate Fermi gas. With F(x) as
"anti-derivative" to f(x) passing through zero, thus f(x) = dF/dx and F(0) = 0,
after integration by parts, we obtain

$$\mathcal{F} = \left.\frac{F(x)}{\exp\{\beta(x - x_0)\} + 1}\right|_0^\infty - \int_0^\infty F(x)\,\frac{d}{dx}\,\frac{1}{\exp\{\beta(x - x_0)\} + 1}\,dx .$$


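The first term on the right vanishes because F(0) = 0 and the denominator for
x → ∞ is too large, while it is clear that only the integrand near $x \approx x_0$ contributes
to the second. Therefore, we expand F(x) in a Taylor expansion about this position
to obtain

$$\mathcal{F} = -\sum_{n=0}^\infty \frac{1}{n!}\left.\frac{d^nF}{dx^n}\right|_{x_0} \int_0^\infty (x - x_0)^n\,\frac{d}{dx}\,\frac{1}{\exp\{\beta(x - x_0)\} + 1}\,dx .$$

With $z = \beta(x - x_0)$ and $d(e^z + 1)^{-1}/dz = -(e^z + 1)^{-2}\,e^z$, it follows that

$$\int_0^\infty (x - x_0)^n\,\frac{d}{dx}\,\frac{1}{\exp\{\beta(x - x_0)\} + 1}\,dx = -\frac{1}{\beta^n}\int_{-\beta x_0}^\infty \frac{z^n\,dz}{(e^z + 1)(e^{-z} + 1)} .$$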


Because of the denominator, the important contributions to the integrand come only
from $z \approx 0$, since we assumed $\beta x_0 \gg 1$. Therefore, the lower integration limit
may be taken as −∞. Then terms with n odd do not contribute, and for n even,

$$\int_{-\infty}^\infty \frac{z^n\,dz}{(e^z + 1)(e^{-z} + 1)} = -2\int_0^\infty z^n\,\frac{d}{dz}\,\frac{1}{e^z + 1}\,dz ,$$

which gives 1 for n = 0. For n > 0, we integrate by parts and use
$z^n/(e^z + 1)\big|_0^\infty = 0$:

$$\int_{-\infty}^\infty \frac{z^n\,dz}{(e^z + 1)(e^{-z} + 1)} = 2n\int_0^\infty \frac{z^{n-1}\,dz}{e^z + 1} .$$
In the next section (on bosons), we shall arrive at nearly the same integral, except that
there, −1 occurs in the denominator instead of +1. Therefore, for n ∈ {1, 2, ...}, we
consider here the two denominators simultaneously and expand
$e^{-z}/(1 \mp e^{-z})$ in a geometric series:

$$\int_0^\infty \frac{z^{n-1}\,dz}{e^z \mp 1} = \sum_{k=0}^\infty (\pm)^k \int_0^\infty z^{n-1}\,e^{-(k+1)z}\,dz = (n-1)!\,\sum_{k=0}^\infty \frac{(\pm)^k}{(k+1)^n} .$$


Both sums lead to Riemann's zeta function (see Fig. 6.21):

$$\zeta(z) = \sum_{k=0}^\infty \frac{1}{(k+1)^z} \quad\text{for}\quad \mathrm{Re}\,z > 1 ,$$

because the alternating sum (for fermions) is equal to $(1 - 2(\frac12)^n)\,\zeta(n)$, given that
1 + k is even for all negative terms and their sum leads to $(\frac12)^n\zeta(n)$. We need
$\zeta(2) = \pi^2/6$ and in the next section $\zeta(4) = \pi^4/90$, but later also $\zeta(3)$,
$\zeta(\frac32)$, and $\zeta(\frac52)$. The two values for ζ(2) and ζ(4) result from a Fourier
expansion of the meander curve [9].
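The relation between the Fermi and Bose integrals and the zeta function can be checked numerically. The sketch below evaluates ζ(s) from a partial sum with a simple tail estimate and compares Simpson integrals with $(n-1)!\,\zeta(n)$ for bosons and $(n-1)!\,(1 - 2^{1-n})\,\zeta(n)$ for fermions; all function names, the cutoff `z_max`, and the step numbers are assumptions, and the integrand guard at z = 0 is only valid for n ≥ 2.

```python
import math


def zeta(s, terms=10000):
    """Riemann zeta for s > 1: partial sum plus an Euler-Maclaurin tail estimate."""
    partial = sum(1.0 / (k + 1) ** s for k in range(terms))
    tail = terms ** (1.0 - s) / (s - 1.0) - 0.5 * terms ** (-s)
    return partial + tail


def integrand(z, n, sign):
    """z^(n-1) / (e^z - sign); sign = +1 for bosons, -1 for fermions (n >= 2)."""
    if z == 0.0:
        return 1.0 if (sign == 1 and n == 2) else 0.0   # limiting value at z = 0
    return z ** (n - 1) / (math.exp(z) - sign)


def statistics_integral(n, sign, z_max=60.0, steps=60000):
    """Simpson estimate of the integral from 0 to z_max (steps must be even)."""
    h = z_max / steps
    total = integrand(0.0, n, sign) + integrand(z_max, n, sign)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h, n, sign)
    return total * h / 3.0


if __name__ == "__main__":
    print(zeta(2.0), math.pi**2 / 6.0)
    print(statistics_integral(2, -1), 0.5 * zeta(2.0))             # fermions, n = 2
    print(statistics_integral(4, +1), math.factorial(3) * zeta(4.0))  # bosons, n = 4
```

For fermions with n = 2 the factor $1 - 2(\frac12)^n$ equals one half, so the integral is $\pi^2/12$; this is the value that enters the low-temperature expansion.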


Table 6.2 Riemann zeta function for 1 ≤ x ≤ 4. See also Fig. 6.21

    x      ζ(x)
    1.0    ∞
    1.5    2.612375
    2.0    1.644934
    2.5    1.341487
    3.0    1.202057
    3.5    1.126734
    4.0    1.082323

Fig. 6.21 Riemann zeta function ζ(x) for 1 ≤ x ≤ 4. See also Table 6.2

We thus obtain the expression up to order n = 2 (and 3):

$$\mathcal{F} = F(x_0) + \frac{\pi^2}{6}\,\frac{1}{\beta^2}\left.\frac{df}{dx}\right|_{x = x_0} + \cdots ,$$

or for the Fermi distribution, as the weight function (see footnote 7)

$$\frac{1}{\exp\{\beta(x - x_0)\} + 1} \approx \varepsilon(x_0 - x) - \frac{\pi^2}{6\beta^2}\,\delta'(x - x_0) + \cdots$$

in an integral, with the step function ε(x) mentioned on p. 18 and the derivative δ′(x)
of the Delta function.
Putting all this together, we thus have for the degenerate Fermi gas, 


nm a ô 
(A) © AW) += ET? SOO) 
6 de e=L 
with A(u) = Seale’) g(e’) de’. Here, since dA/de = ag(e), A(u) differs from 
A(ep) = (A)r—o by approximately (u — ef) a(ep) g(ep). In order to evaluate the 
chemical potential u (T), we consider the particle number, which does not depend on 


7In nuclear physics, the radial distribution of nuclear matter is similar to a Fermi distribution [10]. 


592 6 Thermodynamics and Statistics 


the temperature, and hence take a(e) = 1. Then (u — ep) g (ep) + in? (kT) g'er) © 
0. If we use this in 


(A) — (A)r=o © (u — ep) a(er) g(eF) + in?’ (kT)* {a' (er) g (er) + aler)g' (eF)} , 


then all terms on the right cancel out except for the term in? (kT)? a' (ep) g(ep). The 
only thing missing is the density of states g (ep). Here, Q(e) « e*/*, and the further 
factor is equal to N ep 7/7, so g(ep) = 3N /er. From this, for T ~ 0, we find the 
important result 

m? N 


(A) ~ (A)r=0 + a' (ep) (kT)? . 


4 ep 
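The expansion with the Fermi distribution as weight function can be tested numerically. The sketch below is only an illustration under stated assumptions: the test function f(x) = x², the values β = 20 and x_0 = 1, and the Simpson integrator are arbitrary choices, not taken from the text. For this f, F(x_0) = x_0³/3 and df/dx = 2x_0.

```python
import math

def fermi_weighted_integral(f, beta, x0, steps=20000):
    """Simpson rule for the integral of f(x)/(exp(beta*(x-x0))+1) over [0, x0 + 40/beta].

    Beyond the upper limit the weight is below exp(-40) and is neglected.
    `steps` must be even.
    """
    upper = x0 + 40.0 / beta
    h = upper / steps

    def g(x):
        return f(x) / (math.exp(beta * (x - x0)) + 1.0)

    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

beta, x0 = 20.0, 1.0                    # assumed values with beta*x0 >> 1
f = lambda x: x * x                     # assumed test function
numeric = fermi_weighted_integral(f, beta, x0)
sommerfeld = x0 ** 3 / 3.0 + math.pi ** 2 / (6.0 * beta ** 2) * 2.0 * x0
print(numeric, sommerfeld)
```

For a quadratic f all higher derivatives in the expansion vanish, so the two numbers should differ only by terms that are exponentially small in βx_0.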


If we take this expression for the internal energy, then a(e) = e and thus a′ = 1. Near
the origin, the internal energy of a degenerate Fermi gas increases with the square of
the temperature. Hence,

$$
C_V = \Big(\frac{\partial U}{\partial T}\Big)_{V,N} \approx \frac{\pi^{2}}{2}\,\frac{kT}{e_F}\,Nk .
$$

For the chemical potential μ(T) ≈ e_F − (π²/6) (kT)² g′(e_F)/g(e_F), using

$$
g(e) \approx \frac{3}{2}\,\frac{N}{e_F}\,\sqrt{\frac{e}{e_F}} ,
$$

and thus g′(e)/g(e) ≈ (1/2) e^{−1}, we find (see Fig. 6.22)

$$
\mu \approx e_F\,\Big\{1 - \frac{\pi^{2}}{12}\,\Big(\frac{kT}{e_F}\Big)^{2}\Big\} .
$$

Thus, it varies as T² for a Fermi gas near zero temperature, whereas it varies
linearly with T for a Bose gas because according to p. 582, we then have μ(T) ≈
e_0 − kT/N. As expected according to p. 572, the chemical potential decreases with
increasing temperature in both cases.
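The low-temperature formula for μ(T) can be compared with a direct numerical solution of the particle-number condition. The sketch below assumes the density of states g(e) ∝ √e used above, works in units with e_F = 1, and substitutes e = u² to avoid the square-root singularity at the origin; the integrator, bracket, and function names are illustrative assumptions.

```python
import math

def particle_integral(mu, t, steps=4000):
    """Integral of sqrt(e)/(exp((e-mu)/t)+1) de in units e_F = 1, with e = u**2."""
    u_max = math.sqrt(mu + 40.0 * t)      # weight below exp(-40) is neglected
    h = u_max / steps

    def g(u):
        return 2.0 * u * u / (math.exp((u * u - mu) / t) + 1.0)

    total = g(0.0) + g(u_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

def chemical_potential(t):
    """Bisection for mu/e_F at reduced temperature t = kT/e_F (assumed small)."""
    target = 2.0 / 3.0                    # value of the integral at T = 0, mu = e_F
    lo, hi = 0.5, 1.5
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if particle_integral(mid, t) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

t = 0.05
print(chemical_potential(t), 1.0 - math.pi ** 2 / 12.0 * t ** 2)
```

The bracket [0.5, 1.5] is only adequate for kT well below e_F; at higher temperatures μ drops below it and the high-temperature expansion is the appropriate tool.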

The “high-temperature expansion” in Fig. 6.22 relies on Z_C(1) = V/λ³ (see
p. 583), but uses a more precise expression for the chemical potential, and in
particular, one which differentiates between bosons and fermions. For sufficiently high
temperatures in

$$
N = \sum_i \Big(\exp\frac{e_i-\mu}{kT} \mp 1\Big)^{-1} ,
$$

we have μ < 0 and hence e_i − μ > 0. After multiplying numerator and denominator by
exp{−(e_i − μ)/kT}, each term can be expanded in a geometric series. After reordering
the series, it follows that


$$
N = \sum_{n=0}^{\infty} (\pm)^{n}\,\exp\Big((n+1)\,\frac{\mu}{kT}\Big)\,\sum_i \exp\Big(-(n+1)\,\frac{e_i}{kT}\Big) .
$$

We may write the last sum as V λ^{−3}(T/(n + 1)). With λ(T) ∝ T^{−1/2} and the
abbreviation σ = exp(μ/kT) for the fugacity, we obtain an implicit equation for the
determination of the chemical potential, which contains the polylogarithm Li_x(z)
(see Fig. 6.23):

$$
N = \pm\frac{V}{\lambda^{3}(T)}\,\sum_{n=1}^{\infty} \frac{(\pm\sigma)^{n}}{n^{3/2}}
= \pm\frac{V}{\lambda^{3}(T)}\,\mathrm{Li}_{3/2}(\pm\sigma) .
$$

Fig. 6.22 The chemical potential of an ideal monatomic Fermi gas as a function of temperature,
relative to the Fermi energy e_F = (9π/16)^{1/3} kT_0 with kT_0 from Fig. 6.19. The continuous red curve
corresponds to the high-temperature expansion, the dashed magenta curve to the low-temperature
expansion, and the dotted blue curve to a Bose gas (see Fig. 6.29 for Bose–Einstein condensation)

Fig. 6.23 The polylogarithm Li_x(z) = Σ_{n=1}^∞ z^n/n^x for |z| < 1, continuous for x = 1 (green) and 2 (red),
dashed for x = 3/2 (blue) and 5/2 (black). The name stems from Li_1(z) = −ln(1 − z). Then Li_2(z) is
also called the dilogarithm. Furthermore, Li_x(1) = ζ(x) and (also for |z| > 1)
dLi_x/dz = Li_{x−1}(z)/z
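The implicit equation for the fugacity can be solved numerically. The sketch below is a rough illustration, assuming a dimensionless degeneracy parameter d = Nλ³/V small enough that the solution lies in 0 < σ < 1 (for fermions this requires roughly d < 0.76); the truncated series for Li_{3/2} and the bisection are illustrative choices.

```python
def polylog(s, z, terms=2000):
    """Truncated series sum_{n>=1} z^n / n^s, adequate for |z| clearly below 1."""
    return sum(z ** n / n ** s for n in range(1, terms + 1))

def fugacity(d, sign):
    """Solve d = sign * Li_{3/2}(sign * sigma) for sigma in (0, 1).

    d    : N * lambda^3 / V (assumed small: high-temperature regime)
    sign : +1 for bosons, -1 for fermions
    """
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sign * polylog(1.5, sign * mid) < d:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

d = 0.3
sigma_bose = fugacity(d, +1)
sigma_fermi = fugacity(d, -1)
print(sigma_bose, d, sigma_fermi)   # bosons lie below, fermions above the classical value
```

In the classical limit d → 0 both solutions approach σ = d, which corresponds to Z_C(1) = V/λ³; the first correction has opposite sign for bosons and fermions.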




6.5.7 Electromagnetic Radiation in a Cavity 


An interesting and important system consists of photons in a cavity of volume V. 
They may be absorbed or emitted by the walls so the particle number is not fixed, not 
even on average. Therefore, there is no chemical potential (u = 0), and the canonical 
ensemble suffices with the free energy as thermodynamic potential: 


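$$
F = -kT\,\ln Z_C = kT\,\sum_i \ln\Big(1 - \exp\frac{-\hbar\omega_i}{kT}\Big) .
$$

The second equation holds, because we are dealing with bosons. They move with
the speed of light. Therefore, we have e_i = ħω_i = ħ c k_i with k_i = n_i π/L, as on
p. 525, so ω_i = n_i π c/L. Since there are two polarization possibilities (helicities),
the number of states follows from

$$
2\cdot\frac{1}{8}\,4\pi n^{2}\,dn = \frac{V}{\pi^{2}c^{3}}\,\omega^{2}\,d\omega .
$$

If we replace the sum in the partition function by an integral, we obtain

$$
\frac{F}{V} = \frac{kT}{\pi^{2}c^{3}} \int_0^{\infty} \ln\Big(1 - \exp\frac{-\hbar\omega}{kT}\Big)\,\omega^{2}\,d\omega
= \frac{kT}{\pi^{2}}\,\Big(\frac{kT}{\hbar c}\Big)^{3} \int_0^{\infty} \ln\big(1 - e^{-x}\big)\,x^{2}\,dx .
$$

According to the last section, integration by parts with x³ ln(1 − e^{−x})|_0^∞ = 0 yields

$$
\int_0^{\infty} \ln\big(1 - e^{-x}\big)\,x^{2}\,dx = -\frac{1}{3}\int_0^{\infty} \frac{x^{3}\,dx}{e^{x}-1} = -2\,\zeta(4) = -\frac{\pi^{4}}{45} .
$$

With the Stefan–Boltzmann constant (see p. 623)

$$
\sigma \equiv \frac{\pi^{2}}{60}\,\frac{k^{4}}{\hbar^{3}c^{2}} ,
$$

the result reads

$$
F = -\frac{4\sigma}{3c}\,V\,T^{4} .
$$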


For the radiation pressure p = −(∂F/∂V)_T and the entropy S = −(∂F/∂T)_V, this
gives

$$
p = \frac{4\sigma}{3c}\,T^{4} \qquad\text{and}\qquad S = \frac{16\sigma}{3c}\,V\,T^{3} .
$$

The pressure does not depend on the volume. For the free enthalpy G = F + pV, we
obtain the value 0, as expected from the Gibbs–Duhem relation with μ = 0. Clearly,
TS = −4F = 4pV and thus

$$
U = -3F = 4\,\frac{\sigma}{c}\,V\,T^{4} \qquad\text{and}\qquad p = \frac{1}{3}\,\frac{U}{V} .
$$


For ideal gases, we also have p ∝ U/V, but with the factor 2/3 for the monatomic
gas: for v ≪ c the pressure is twice as large as for v ≈ c. The frequency of collisions
of the molecules is proportional to their speed, and the recoil proportional to their
momentum. The product of velocity times momentum is important for the pressure.
In the relativistic regime, it is equal to the energy (see p. 245), but twice as large in
the non-relativistic regime.

If the wall has a hole of area A, then the energy per unit time that flows from the 
cavity is the area times the light intensity, viz., 


$$
A\,I = \frac{A\,c\,U}{V}\,\frac{1}{4\pi}\int_{2\pi} \cos\theta\,d\Omega
= A\,4\sigma T^{4}\,\frac{1}{2}\int_0^{1} \cos\theta\,d\cos\theta ,
$$

where θ is the angle between the current direction and the normal to the area. This
then leads to the Stefan–Boltzmann equation

$$
I = \sigma T^{4} ,
$$

where the Stefan–Boltzmann constant σ was already introduced above.
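To get a feeling for the magnitudes, the Stefan–Boltzmann constant and the derived quantities can be evaluated numerically. This is a minimal sketch; the SI values of h, k, and c are the exact defining constants of the 2019 SI, and the chosen temperature of 6000 K is just an example.

```python
import math

H_PLANCK = 6.62607015e-34     # J s (exact in the SI)
K_BOLTZMANN = 1.380649e-23    # J/K (exact in the SI)
C_LIGHT = 299792458.0         # m/s (exact in the SI)
HBAR = H_PLANCK / (2.0 * math.pi)

# sigma = pi^2 k^4 / (60 hbar^3 c^2)
SIGMA = math.pi ** 2 * K_BOLTZMANN ** 4 / (60.0 * HBAR ** 3 * C_LIGHT ** 2)

def energy_density(temperature):
    """U/V = 4 sigma T^4 / c in J/m^3."""
    return 4.0 * SIGMA * temperature ** 4 / C_LIGHT

def radiation_pressure(temperature):
    """p = U/(3V) in Pa."""
    return energy_density(temperature) / 3.0

def emitted_intensity(temperature):
    """I = sigma T^4 in W/m^2, the flux through a small hole in the cavity wall."""
    return SIGMA * temperature ** 4

T = 6000.0   # K, example value
print(SIGMA, energy_density(T), radiation_pressure(T), emitted_intensity(T))
```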
According to p. 580, the average number of (polarized) photons in the ith
one-particle state is given by the Planck distribution:

$$
\langle n_i\rangle = \frac{1}{\exp(\hbar\omega_i/kT) - 1} .
$$

For the frequency interval dω, the energy density is therefore (see Fig. 6.24)

$$
\frac{dU}{V} = \frac{\hbar\omega}{\exp(\hbar\omega/kT) - 1}\,\frac{\omega^{2}\,d\omega}{\pi^{2}c^{3}} .
$$

This Planck radiation formula freezes out high frequencies, while for low frequencies
it goes over to the Rayleigh–Jeans law

$$
\frac{dU}{V} \approx kT\,\frac{\omega^{2}\,d\omega}{\pi^{2}c^{3}} ,
$$

which was originally derived for classical oscillators. According to the
equidistribution law, each one contributes kT to the internal energy. But this led to the
ultraviolet catastrophe: U/V was not finite.

The maximum of the energy density as a function of the wavelength λ = 2πc/ω
follows with |ω³ dω| = (2πc)⁴ λ^{−5} dλ from x̂ ≡ hc/(kT λ̂) = 5{1 − exp(−x̂)} as
x̂ = 4.965114231745. Together with the second radiation constant c_2 = hc/k (see
Fig. 6.24), this leads to

$$
\hat\lambda = \frac{1}{\hat x}\,\frac{hc}{kT} = \frac{c_2}{4.965114231745\;T} .
$$

This is Wien’s displacement law: the higher the temperature, the shorter the most
intense wavelength. As a function of the angular frequency ω (or the energy ħω), the
maximum follows from x̂ ≡ ħω̂/(kT) as x̂ = 3{1 − exp(−x̂)} = 2.821439372122.

Incidentally, according to the above equation for ⟨n_i⟩, the total number of photons
in the volume V may be evaluated from N/V = 2ζ(3) π^{−2} (kT/ħc)³ with ζ(3) ≈
1.202. This depends strongly on the temperature. With this value, we find U =
π⁴/(30ζ(3)) NkT ≈ 2.7 NkT and hence the average energy per photon.

Fig. 6.24 Planck’s radiation distribution φ(λ, T) = c_1 λ^{−5}/{exp(c_2/(λT)) − 1} with the first
radiation constant c_1 = 2πhc² and the second radiation constant c_2 = hc/k. Here φ is the radiation flux
density emitted into a half space, viz., φ = ¼ c du/dλ. The factor ¼ c was derived for the
Stefan–Boltzmann equation. Three isotherms are shown. The visible light range (400 nm ≤ λ ≤ 750 nm)
is indicated by dashed lines. The temperature of the surface of the Sun is such that a lot of visible
light is emitted (adaption of the eye)
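The two transcendental equations for the maxima and the mean photon energy are easy to reproduce. The following sketch uses plain fixed-point iteration, which is an illustrative choice (it converges here because the slope n·exp(−x) is well below one at the solution); ζ(3) is estimated from a truncated sum.

```python
import math

def wien_root(n, iterations=200):
    """Nonzero solution of x = n * (1 - exp(-x)) by fixed-point iteration."""
    x = float(n)
    for _ in range(iterations):
        x = n * (1.0 - math.exp(-x))
    return x

x_wavelength = wien_root(5)   # maximum as a function of the wavelength
x_frequency = wien_root(3)    # maximum as a function of the angular frequency

# Mean energy per photon in units of kT: U/(N k T) = pi^4 / (30 zeta(3)).
zeta3 = sum(1.0 / n ** 3 for n in range(1, 100001))
mean_photon_energy = math.pi ** 4 / (30.0 * zeta3)

print(x_wavelength, x_frequency, mean_photon_energy)
```

The two maxima do not correspond to the same photon energy, since the distributions per unit wavelength and per unit frequency differ by the Jacobian |dω/dλ|.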


6.5.8 Lattice Vibrations 


In a solid, each of the N atoms may oscillate about its equilibrium site. Here we may
restrict ourselves to harmonic oscillations with small displacements and introduce
3N normal coordinates (see Sect. 2.3.9). We can then describe the motion of the atoms
as 3N decoupled oscillations, i.e., sound waves, corresponding to phonons as quanta,
without fixing their number. They obey Bose–Einstein statistics. In contrast to the
photons in the last section, we have only a finite number (3N) of eigenfrequencies,
in particular a limiting frequency ω_max.

The excitation energy of the states |n_1, n_2, ...⟩_s is Σ_{i=1}^{3N} n_i ħω_i. Since the number
of phonons is not limited, we consider, as for photons, the canonical partition
function

$$
Z_C = \sum_{n_1 n_2 \ldots} \exp\Big(-\sum_{i} \frac{n_i\,\hbar\omega_i}{kT}\Big)
= \prod_{i=1}^{3N} \frac{1}{1 - \exp(-\hbar\omega_i/kT)} ,
$$

or ln Z_C = −Σ_{i=1}^{3N} ln(1 − exp(−ħω_i/kT)). The energy is therefore

$$
U = -\frac{\partial \ln Z_C}{\partial (kT)^{-1}} = \sum_{i=1}^{3N} \frac{\hbar\omega_i}{\exp(\hbar\omega_i/kT) - 1}
$$

and the heat capacity at constant volume (fixed frequencies) is

$$
C_V = \Big(\frac{\partial U}{\partial T}\Big)_{V}
= k \sum_{i=1}^{3N} \Big(\frac{\hbar\omega_i}{kT}\Big)^{2}\,\frac{\exp(\hbar\omega_i/kT)}{\{\exp(\hbar\omega_i/kT) - 1\}^{2}} .
$$

For kT ≫ ħω_max, we have the Dulong–Petit law C_V ≈ 3Nk, which in classical physics
follows from the equidistribution law for all temperatures.

With decreasing temperature, ever more degrees of freedom freeze, and for low
temperatures, only the low frequency eigenoscillations are important, i.e., the normal
oscillations with longer wavelength. These wavelengths are essentially longer than
the interatomic distances, and we may make an ansatz for the density of states
∝ ω² (according to Debye) like the one for the electromagnetic radiation in a cavity.
However, we have to account for the fact that there is an upper bound ω_max for the
eigenfrequencies:

$$
g_D(\omega) = \begin{cases} 9N\,\omega_D^{-3}\,\omega^{2} & \text{for } \omega \le \omega_D \equiv \omega_{\max} , \\ 0 & \text{otherwise} . \end{cases}
$$

The factor 9N ω_D^{−3} follows from the constraint 3N = ∫_0^∞ g_D(ω) dω. This yields

$$
U = \int_0^{\infty} \hbar\omega\,\{\exp(\hbar\omega/kT) - 1\}^{-1}\,g_D(\omega)\,d\omega ,
$$

for the energy, or

$$
U = 9NkT\,f_D(\hbar\omega_D/kT) ,
$$

with the Debye function f_D(x), which is displayed in Fig. 6.25.


$$
f_D(x) \equiv \frac{1}{x^{3}} \int_0^{x} \frac{y^{3}\,dy}{e^{y}-1}
= \frac{1}{x^{3}}\,\Big[\frac{\pi^{4}}{15} - \int_x^{\infty} \frac{y^{3}\,dy}{e^{y}-1}\Big] .
$$

Fig. 6.25 Debye function f_D(x) (continuous red curve) and its approximation (π⁴/15)/x³ (dashed blue
curve)

Fig. 6.26 Temperature dependence of the lattice energy, showing (3/5)π⁴ (T/T_D)⁴, U/NkT_D, and
−3F/NkT_D. For T ≪ T_D, we have U ≈ −3F ≈ (3/5)π⁴ NkT_D (T/T_D)⁴


It is also common to introduce a Debye temperature T_D = ħω_D/k (200–300 K).
For T ≪ T_D, the last integral is not important. Then, for the heat capacity,

$$
C_V \approx \frac{12\pi^{4}}{5}\,Nk\,\Big(\frac{T}{T_D}\Big)^{3} .
$$

In fact, for low temperatures, C_V ∝ T³ is observed, except for metals at very low
temperature. (There the conduction electrons contribute, and their heat capacity is
proportional to T according to p. 592.) Integrating by parts, the free energy is obtained
from

$$
F = -kT\,\ln Z_C = kT \int_0^{\infty} \ln\Big(1 - \exp\frac{-\hbar\omega}{kT}\Big)\,g_D(\omega)\,d\omega
= 3NkT\,\Big\{\ln\Big(1 - \exp\frac{-\hbar\omega_D}{kT}\Big) - f_D\Big(\frac{\hbar\omega_D}{kT}\Big)\Big\} .
$$

For low temperatures, F ≈ −(1/3) U and S ≈ (1/3) C_V ∝ T³ (see Fig. 6.26), as for
electromagnetic radiation in a cavity for all temperatures. Note that, for the harmonic
oscillations about fixed positions we are concerned with here, F does not depend on
the volume, so a pressure cannot be derived for phonons.
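The Debye function and its two limits can be checked numerically. The sketch below assumes the definition f_D(x) = x^{−3} ∫_0^x y³ dy/(e^y − 1), consistent with U = 9NkT f_D(ħω_D/kT); the Simpson integrator and the function names are illustrative choices.

```python
import math

def debye_function(x, steps=2000):
    """f_D(x) = x^-3 * integral_0^x y^3/(e^y - 1) dy by the Simpson rule (steps even)."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    h = x / steps

    def g(y):
        # The integrand tends to 0 like y^2 for y -> 0.
        return 0.0 if y == 0.0 else y ** 3 / math.expm1(y)

    total = g(0.0) + g(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0 / x ** 3

def energy_per_nk_td(t_over_td):
    """U/(N k T_D) = 9 (T/T_D) f_D(T_D/T)."""
    return 9.0 * t_over_td * debye_function(1.0 / t_over_td)

# High temperatures: f_D -> 1/3, i.e., U -> 3 N k T (Dulong-Petit).
print(debye_function(1e-3))
# Low temperatures: U -> (3/5) pi^4 N k T_D (T/T_D)^4.
print(energy_per_nk_td(0.02), 3.0 * math.pi ** 4 / 5.0 * 0.02 ** 4)
```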




6.5.9 Summary: Results for the Single-Particle Model 


In this section, we calculated partition functions for several examples and thereby
derived the equations of state, i.e., verifiable statements, which were not always
obvious for the original many-particle problem. Quantum theory was always
necessary here: classical physics leads to internal contradictions, e.g., to Gibbs’ paradox
(the entropy has to be an extensive variable) and to the ultraviolet catastrophe. We
have restricted ourselves to examples which can all be described in the
single-particle model of independent quanta: gases, conduction electrons, electromagnetic
radiation, and lattice oscillations. The first two examples were treated as grand
canonical ensembles, because the particle number is an important parameter for them,
and the last two as canonical ensembles, because the number of oscillation quanta
(photons, phonons) cannot be given as a fixed variable in those cases.


6.6 Phase Transitions 


6.6.1 Van der Waals Equation 


The equation of state of ideal gases assumes sufficiently high temperatures, because
real gases behave differently at lower temperatures, when interactions between the
molecules may no longer be neglected. These interactions are strongly repulsive for
small distances and weakly attractive for large distances. If the electronic shells of
two molecules overlap, they repel each other strongly, so we assign a volume b to each
molecule which is inaccessible to the others. Then the volume in the gas equation
must be replaced by V − Nb = N (v − b). At greater distances, on the other hand,
the molecules attract each other weakly like electric dipoles. It is not necessary for
permanent dipole moments to exist here. Before the quantum mechanical average, all
molecules have dipole moments, whose coupling does not vanish under the averaging
process. This attraction reduces the pressure on the outer walls and is proportional to
the product of the molecular densities in the interior of the volume and at the surface,
hence proportional to v^{−2}. Therefore, in the gas equation, we have to replace the
pressure by p + a v^{−2}. We thus generalize the equation pv = kT for ideal gases to
the van der Waals equation

$$
\Big(p + \frac{a}{v^{2}}\Big)\,(v - b) = kT .
$$

These additional terms contribute only for comparatively small v = V/N.
Of course, the equation only makes sense for v > b. But it does not generally hold
even then, because it is an equation of third order in v(p, T), viz.,

$$
p\,v^{3} - (bp + kT)\,v^{2} + a\,v - ab = 0 ,
$$

and therefore allows for three different densities N/V. In fact, the van der Waals
equation describes not only real gases rather well, but to some extent also liquids. It
only gets things wrong for the phase transition. This is not so surprising, because so
far we have assumed homogeneous systems rather than a spatially separated gas and
liquid with their different densities.

How should the van der Waals solution be modified in order to describe the phase 
transition without contradictions? Here we argue that, of three real solutions v(p, T), 
the one with the highest density (lowest v) should hold for the liquid and the one with 
the lowest density (highest v) for the gas. For given p and T, the two phases exist 
simultaneously between these densities. For the phase transition, despite a change 
in v, we nevertheless expect p and T to remain constant. If we take, e.g., isotherms 
as functions p (v), then the van der Waals solution in this ambiguous regime should 
be replaced by a horizontal straight line segment. 

In order to determine the pressure at which this straight line segment is to be taken,
we have to respect the free enthalpy and the equilibrium condition μ_1 = μ_2 for the
phase transition. We have dN_1 = −dN_2 and dT = 0 and therefore dG = V dp. The
area ∫ V dp between the van der Waals isotherm and the straight line segment has to be
chosen (Maxwell construction) such that ∮ dG vanishes, because G is a state variable.
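The Maxwell construction can be carried out numerically. The sketch below works with the reduced van der Waals isotherm p_r = 8T_r/(3v_r − 1) − 3/v_r² (reduced by the critical values) and uses the equivalent equal-area condition in the p(v) diagram; the bisection brackets and function names are illustrative assumptions, and T_r must lie below 1.

```python
import math

def p_vdw(v, t):
    """Reduced van der Waals pressure at reduced volume v and temperature t."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v ** 2

def dp_dv(v, t):
    return -24.0 * t / (3.0 * v - 1.0) ** 2 + 6.0 / v ** 3

def bisect(f, lo, hi, iters=200):
    """Plain bisection, assuming f changes sign between lo and hi."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def area_mismatch(p, t, v_a, v_b):
    """Integral of (p_vdw - p) dv between the liquid and the gas root, plus both roots."""
    v_l = bisect(lambda v: p_vdw(v, t) - p, 1.0 / 3.0 + 1e-12, v_a)
    v_g = bisect(lambda v: p_vdw(v, t) - p, v_b, (8.0 * t / p + 1.0) / 3.0)
    prim = lambda v: 8.0 * t / 3.0 * math.log(3.0 * v - 1.0) + 3.0 / v
    return prim(v_g) - prim(v_l) - p * (v_g - v_l), v_l, v_g

def maxwell_pressure(t):
    """Coexistence pressure and volumes (p, v_liquid, v_gas) for reduced t < 1."""
    v_a = bisect(lambda v: dp_dv(v, t), 1.0 / 3.0 + 1e-9, 1.0)   # local pressure minimum
    v_b = bisect(lambda v: dp_dv(v, t), 1.0, 100.0)              # local pressure maximum
    p_lo, p_hi = max(p_vdw(v_a, t), 1e-12), p_vdw(v_b, t)
    for _ in range(200):
        p = 0.5 * (p_lo + p_hi)
        mismatch, v_l, v_g = area_mismatch(p, t, v_a, v_b)
        if mismatch > 0.0:
            p_lo = p          # line too low: raise the pressure
        else:
            p_hi = p
    return p, v_l, v_g

print(maxwell_pressure(0.8))
```

The outer bisection relies on the area mismatch decreasing monotonically with p, which holds because its derivative with respect to p is −(v_gas − v_liquid).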

The van der Waals equation does not therefore always deliver (∂p/∂v)_T ≤ 0, as
it actually should according to p. 560 with (ΔV)² ≥ 0. Given that

$$
(\partial p/\partial v)_T = -kT/(v-b)^{2} + 2a/v^{3} ,
$$

in particular, the stability condition requires 2a (v − b)²/v³ ≤ kT. This is not always
satisfied for low temperatures. The stable phase becomes unstable if we have equality
here and in addition (∂²p/∂v²)_T vanishes, which leads to kT = 3a (v − b)³/v⁴. At
the critical point for the stability, it is clear that kT_c = 2a (v_c − b)²/v_c³ = 3a (v_c −
b)³/v_c⁴, whence

$$
v_c = 3b , \qquad kT_c = \frac{8a}{27b} , \qquad\text{and}\qquad p_c = \frac{a}{27b^{2}} ,
$$

and thereby p_c v_c = (3/8) kT_c, in contrast to an ideal gas. Note that the van der Waals
equation holds only approximately here. Instead of p_c v_c/kT_c = 0.375, we observe 0.31 for
O_2, 0.29 for N_2, and 0.23 for H_2O. With the reduced quantities v_r = v/v_c, T_r = T/T_c,
and p_r = p/p_c, the van der Waals equation reads (see Fig. 6.27)

$$
\Big(p_r + \frac{3}{v_r^{2}}\Big)\,(3v_r - 1) = 8\,T_r .
$$

The parameters a and b are then hidden.
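The critical values and the reduced form of the equation can be verified with a few lines. This is a minimal sketch; the parameter values a = 2 and b = 0.5 are arbitrary illustrative numbers in consistent units, not data for any real gas.

```python
def critical_point(a, b):
    """Return (v_c, kT_c, p_c) for van der Waals parameters a and b."""
    return 3.0 * b, 8.0 * a / (27.0 * b), a / (27.0 * b ** 2)

def pressure(v, kT, a, b):
    """Van der Waals pressure p = kT/(v - b) - a/v^2."""
    return kT / (v - b) - a / v ** 2

def reduced_pressure(v_r, t_r):
    """Reduced form p_r = 8 T_r/(3 v_r - 1) - 3/v_r^2."""
    return 8.0 * t_r / (3.0 * v_r - 1.0) - 3.0 / v_r ** 2

a, b = 2.0, 0.5                          # assumed example parameters
v_c, kT_c, p_c = critical_point(a, b)
print(p_c * v_c / kT_c)                  # critical ratio p_c v_c / (k T_c)

# The isotherm has a horizontal inflection point at the critical point.
h = 1e-4
slope = (reduced_pressure(1.0 + h, 1.0) - reduced_pressure(1.0 - h, 1.0)) / (2.0 * h)
print(slope)
```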


Fig. 6.27 Van der Waals isotherms with T_r = 1.2, 1.0, and 0.8. The middle red curve is the
critical one, while the lower curve corresponds to a phase transition. Also shown here is the
unstable solution of the van der Waals equation (dashed curve)


6.6.2 Conclusions Regarding the van der Waals Equation 


For the stress coefficient β = (∂p/∂T)_v, the van der Waals equation implies

$$
\beta = \frac{k}{v-b} = \frac{1}{T}\,\Big(p + \frac{a}{v^{2}}\Big) .
$$

According to p. 570, (∂U/∂V)_T = −p + βT. This is now equal to a/v². Thus the
potential energy of the cohesive forces between the molecules contributes to the
internal energy. This addition depends in fact on the volume per particle, but not on
the temperature. Therefore, we also find

$$
\frac{\partial C_V}{\partial V} = \frac{\partial^{2} U}{\partial V\,\partial T} = 0 ,
$$

as for an ideal gas.

On the other hand, according to the equation for (∂p/∂v)_T mentioned in the last
section, the isothermal compressibility is

$$
\kappa_T = -\Big\{v\,\Big(\frac{\partial p}{\partial v}\Big)_T\Big\}^{-1}
= \frac{1}{kT}\,\frac{(v-b)^{2}}{v - (2a/kT)\,(1 - b/v)^{2}} ,
$$

so for the expansion coefficient, we have

$$
\alpha = \beta\,\kappa_T = \frac{1}{T}\,\frac{v-b}{v - (2a/kT)\,(1 - b/v)^{2}} .
$$

According to p. 575, 1 − αT is important for the Joule–Thomson experiment, because
(∂T/∂p)_H contains only the extra factor −V/C_p:

$$
1 - \alpha T = \frac{b - (2a/kT)\,(1 - b/v)^{2}}{v - (2a/kT)\,(1 - b/v)^{2}} .
$$

If we keep only terms of first order in a and b, then this is equal to (b − 2a/kT)/v.
It is negative for low temperatures and delivers (∂T/∂p)_H > 0. All real gases may
be cooled to low temperatures by decompression (dp < 0). But for normal
temperatures, this does not hold for hydrogen and the noble gases. Their cohesive forces
are then weak (a is small), so for normal temperatures these gases heat up under
decompression. Indeed, highly compressed hydrogen ignites upon streaming out of
leaks.
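The sign change of 1 − αT, and hence of the Joule–Thomson effect, can be made explicit with a short calculation. The sketch below evaluates the expression derived above and its first-order approximation; the parameter values (a = b = 1 in arbitrary consistent units) and the function names are illustrative assumptions.

```python
def one_minus_alpha_t(v, kT, a, b):
    """1 - alpha*T for a van der Waals gas (volume per particle v)."""
    corr = (2.0 * a / kT) * (1.0 - b / v) ** 2
    return (b - corr) / (v - corr)

def first_order(v, kT, a, b):
    """Approximation (b - 2a/kT)/v, valid to first order in a and b."""
    return (b - 2.0 * a / kT) / v

def inversion_kT(v, a, b):
    """Temperature (as kT) at which 1 - alpha*T changes sign for given v."""
    return (2.0 * a / b) * (1.0 - b / v) ** 2

a, b, v = 1.0, 1.0, 100.0                    # assumed dilute-gas example
print(one_minus_alpha_t(v, 1.0, a, b))       # negative: cooling under decompression
print(one_minus_alpha_t(v, 3.0, a, b))       # positive: heating under decompression
print(inversion_kT(v, a, b), 2.0 * a / b)    # inversion temperature vs. dilute limit 2a/b
```

Since (∂T/∂p)_H carries the extra factor −V/C_p, a negative value of 1 − αT means that the temperature drops when the pressure drops.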

We can only differentiate the remaining thermal coefficients if we know the 
entropy or one of the thermodynamic potentials. As for the ideal gases, the internal 
degrees of freedom of the molecules are important, and here we proceed as for the 
ideal gases. For the change, we account only for the center-of-mass motion. 

Here we disregard the feedback of a given molecule on the others and describe
the coupling by an effective one-particle potential V(r). Note that, in order to avoid
confusion with the volume V, we shall always indicate the position argument. Then
the classical canonical partition function due to the center-of-mass motion of a
molecule is

$$
Z_C(1) = \frac{1}{h^{3}} \int \exp\Big[-\frac{1}{kT}\,\Big(\frac{p^{2}}{2m} + V(\mathbf r)\Big)\Big]\,d^{3}r\,d^{3}p ,
$$

and according to p. 583,

$$
Z_C(1) = \frac{1}{\lambda^{3}} \int \exp\frac{-V(\mathbf r)}{kT}\,d^{3}r , \qquad\text{with}\quad \lambda = \frac{h}{\sqrt{2\pi m kT}} .
$$

If at first we neglect the attractive forces and account only for the strong repulsion,
then the integral yields N (v − b). The weak attraction is approximated by the mean
value V(r) ≈ −a/v:

$$
Z_C(1) = \lambda^{-3}\,N\,(v-b)\,\exp\frac{a/v}{kT} .
$$


In addition, for independent particles, according to the corrected Boltzmann statistics
(see p. 578), we have

$$
\ln Z_C(N) = N\,\ln\frac{Z_C(1)}{N} .
$$

With this we obtain the free energy

$$
F = -kT\,\ln Z_C = N\,\Big(kT\,\ln\frac{\lambda^{3}}{v-b} - \frac{a}{v}\Big)
$$


and p = −(∂F/∂V)_{T,N} = −N^{−1} (∂F/∂v)_{T,N} = kT/(v − b) − a/v² for the
pressure. Thus we have derived the van der Waals equation in a different way. (For
molecules containing more atoms, F also contains additions, and according to
Sect. 6.5.4, these in fact depend upon T, but not on V, whence we obtain the same
pressure.) But the entropy S = −(∂F/∂T)_{V,N} for a real gas is lower than for an ideal
one:

$$
S_{\text{real}} - S_{\text{ideal}} = Nk\,\ln\frac{v-b}{v} = Nk\,\ln\Big(1 - \frac{b}{v}\Big) .
$$

In addition, the chemical potential μ = (∂F/∂N)_{T,V} is different:

$$
\mu_{\text{real}} - \mu_{\text{ideal}} = -kT\,\ln\frac{v-b}{v} + kT\,\frac{b}{v-b} - \frac{2a}{v} .
$$
v—b v 


6.6.3 Critical Behavior 


The free enthalpy depends on the aggregation state and determines whether a sample
exists in the form of gas or liquid (or solid): only the phase with the lowest free
enthalpy is stable, as we already stressed in Fig. 6.17. For fixed pressure p < p_c, the
(monotonically decreasing) function G(T) has a kink at the transition temperature,
and likewise, for fixed temperature T < T_c, the function G(p) has a kink at the
transition pressure. The first derivatives (∂G/∂T)_p and (∂G/∂p)_T have a discontinuity
for this discontinuous phase transition, and likewise the entropy and the volume:

$$
-\Big(\frac{\partial G_+}{\partial T}\Big)_p = S_+ \neq S_- = -\Big(\frac{\partial G_-}{\partial T}\Big)_p ,
\qquad
\Big(\frac{\partial G_+}{\partial p}\Big)_T = V_+ \neq V_- = \Big(\frac{\partial G_-}{\partial p}\Big)_T .
$$

Here we also speak of a first order phase transition, because the first derivatives of
G are discontinuous. Such phase transitions have a transition enthalpy (the pressure
remains constant) H_+ − H_− = T (S_+ − S_−) ≠ 0 and obey the Clausius–Clapeyron
equation

$$
\frac{dp}{dT} = \frac{S_+ - S_-}{V_+ - V_-} = \frac{1}{T}\,\frac{H_+ - H_-}{V_+ - V_-} ,
$$

discussed on p. 573.

According to Sect. 6.6.1, the isotherm p(V) has a horizontal tangent at the phase
transition, i.e., (∂p/∂V)_T = 0. Therefore, the volume (and density) uncertainty is
infinitely large there. Otherwise, it is negligibly small for macroscopic bodies, e.g., for
an ideal gas, we have (ΔV/V)² = 1/N (since (∂V/∂p)_T = −V/p = −V²/NkT):

$$
(\Delta V)^{2} = -kT\,\Big(\frac{\partial V}{\partial p}\Big)_T = -kT\,\Big/\Big(\frac{\partial p}{\partial V}\Big)_T .
$$

The density therefore fluctuates enormously at the phase transition. Hence, the
isothermal compressibility κ_T = −V^{−1} (∂V/∂p)_T is infinite there, too, and likewise
(if a transition enthalpy is involved) the isobaric heat capacity C_p = T (∂S/∂T)_p
and the expansion coefficient α = V^{−1} (∂V/∂T)_p = −V^{−1} (∂S/∂p)_T.

At the critical point, S_+ and S_− agree with each other, as do V_+ and V_−. A
transition heat is unnecessary, and the first derivatives of G are continuous. But
with (∂V/∂p)_T = (∂²G/∂p²)_T, the second derivative of the free enthalpy is infinite.
Then we have a second order phase transition (a continuous phase transition). At
the critical point, the volume is very unsharp, as for a phase transition of first order:
the density fluctuates strongly. At the critical point, an otherwise transparent body
scatters light very strongly and appears opaque (critical opalescence).

We shall now investigate the behavior near the critical point. According to
Cardano’s formula, the cubic equation v³ + 3Av² + Bv + C = 0 has the three solutions
v_i = x_i − A with

$$
x_0 = R_+ + R_- \qquad\text{and}\qquad x_\pm = -\frac{R_+ + R_-}{2} \pm \mathrm{i}\sqrt{3}\;\frac{R_+ - R_-}{2} ,
$$

and the abbreviations

$$
R_\pm = \sqrt[3]{-Q \pm \sqrt{Q^{2} + P^{3}}} , \qquad\text{with}\quad Q = \frac{C - AB}{2} + A^{3} , \quad P = \frac{B}{3} - A^{2} ,
$$

where the third root is taken such that R_+ R_− = −P. For real coefficients, there are
three real solutions with Q² + P³ < 0, and hence R_− = R_+*. For the reduced van
der Waals equation, we have A = −(8/9) T_r/p_r − 1/9, B = 3/p_r, and C = −1/p_r, and
hence, Q = A³ − (1/2) (3A + 1)/p_r and P = 1/p_r − A². Therefore, near the critical
point with ΔT = T_r − 1 and Δp = p_r − 1, we have

$$
A \approx -1 + \tfrac{8}{9}\,\Delta p - \tfrac{8}{9}\,\Delta T , \qquad
Q \approx \tfrac{1}{3}\,\Delta p - \tfrac{4}{3}\,\Delta T , \qquad
P \approx \tfrac{7}{9}\,\Delta p - \tfrac{16}{9}\,\Delta T .
$$

We reach the critical point along Q = 0, i.e., Δp = 4ΔT. This delivers R_± ≈
±2 √(ΔT/3), and hence for ΔT < 0, i.e., T < T_c, the two solutions v_± − 1 ≈
±2 √(1 − T_r) at the phase boundary. For the density ρ_r ∝ v_r^{−1}, it follows that

$$
|\rho - \rho_c| \propto (T_c - T)^{1/2} .
$$

The density ρ is called an order parameter for the considered system since it has a
discontinuity at the phase transition, and from the last relation, the critical exponent
1/2 for this order parameter is extracted from the van der Waals equation.
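The Cardano solution quoted above can be applied directly to the reduced van der Waals equation. The sketch below uses complex arithmetic for the cube roots and evaluates the three volumes on the line Δp = 4ΔT; the function names and the sample point T_r = 0.99 are illustrative assumptions.

```python
import cmath
import math

def cardano_roots(A, B, C):
    """Roots of v^3 + 3A v^2 + B v + C = 0 via R_+ and R_- with R_+ R_- = -P."""
    P = B / 3.0 - A * A
    Q = A ** 3 + 0.5 * (C - A * B)
    root = cmath.sqrt(Q * Q + P ** 3)
    r_plus = (-Q + root) ** (1.0 / 3.0)
    if r_plus == 0:
        r_plus = (-Q - root) ** (1.0 / 3.0)
    r_minus = -P / r_plus if r_plus != 0 else 0j
    half_sum = 0.5 * (r_plus + r_minus)
    half_diff = 0.5j * math.sqrt(3.0) * (r_plus - r_minus)
    return [2.0 * half_sum - A, -half_sum + half_diff - A, -half_sum - half_diff - A]

def vdw_volumes(t_r, p_r):
    """Sorted real parts of the three reduced volumes (assumes three real solutions)."""
    A = -(8.0 / 9.0) * t_r / p_r - 1.0 / 9.0
    B = 3.0 / p_r
    C = -1.0 / p_r
    return sorted(r.real for r in cardano_roots(A, B, C))

t_r = 0.99
volumes = vdw_volumes(t_r, 1.0 + 4.0 * (t_r - 1.0))      # along Delta p = 4 Delta T
print(volumes)
print(1.0 - 2.0 * math.sqrt(1.0 - t_r), 1.0 + 2.0 * math.sqrt(1.0 - t_r))
```

On this line v_r = 1 is itself a root, and the outer two roots agree with 1 ± 2√(1 − T_r) only to leading order in √(1 − T_r).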

For the isothermal compressibility, p_r = 8T_r/(3v_r − 1) − 3v_r^{−2} implies

$$
\Big(\frac{\partial p_r}{\partial v_r}\Big)_T = -\frac{24\,T_r}{(3v_r - 1)^{2}} + \frac{6}{v_r^{3}}
\approx -6\,T_r\,\big(1 - 3\,\Delta v + \tfrac{27}{4}\,(\Delta v)^{2}\big) + 6\,\big(1 - 3\,\Delta v + 6\,(\Delta v)^{2}\big) .
$$

For T > T_c and Δv ≈ 0, this leads to κ_T^{−1} = 6 p_c (T_r − 1), but for T < T_c and
(Δv)² ≈ 4 (1 − T_r), to κ_T^{−1} = 12 p_c (1 − T_r). In total, this gives

$$
\kappa_T \propto |T - T_c|^{-1} ,
$$

where the proportionality factor for T > T_c is equal to (1/6) T_c/p_c and for T < T_c half
as large. We usually set κ_T ∝ |T − T_c|^{−γ}. According to the van der Waals equation,
the critical exponent here is γ = 1.


6.6.4 Paramagnetism 


Magnetism also provides an example of a phase transition. As for gases, we begin
by neglecting the interaction between the atoms (paramagnetism), and include it
in the next section in the molecular field approximation due to Weiss.

We thus start from the magnetic moment m g μ_B of an atom with μ_B the Bohr
magneton (see p. 327), g the Landé factor, which is equal to (2j + 1)/(2l + 1) for the
angular momentum j = l ± 1/2, according to p. 373, and m the directional (magnetic)
quantum number along the magnetic-field direction. The potential energy is then

$$
W_{\text{pot}} = -m\,g\,\mu_B\,\mu_0 H \equiv -m\,\eta\,kT , \qquad\text{with}\quad \eta \equiv \frac{g\,\mu_B\,\mu_0 H}{kT} .
$$

In vacuum, we have B = μ_0 H and the energy −μ · B due to the coupling of the
magnetic moment to the magnetic field. Nevertheless, here we investigate the
magnetization induced by the magnetic field, and use now μ_0 H instead of B (see Sect.
3.2.6).

For a given magnetic field, the eigenstates of the energy are evenly spaced at
distances η kT from each other. However, there are only 2j + 1 of them and not
infinitely many as for a harmonic oscillator. Hence the directional quantum number
m in the canonical partition function Σ_m exp(mη) takes the values from −j to +j
in integer steps. Now

$$
\sum_{m=-j}^{j} x^{m} = x^{-j} \sum_{n=0}^{2j} x^{n} = x^{-j}\,\frac{1 - x^{2j+1}}{1 - x}
= \frac{x^{j+1/2} - x^{-j-1/2}}{x^{1/2} - x^{-1/2}} .
$$

Hence, for the canonical partition function, we find

$$
Z_C = \sum_{m=-j}^{j} \mathrm{e}^{m\eta} = \frac{\sinh\big((j + \tfrac12)\,\eta\big)}{\sinh\big(\tfrac12\eta\big)} ,
$$

and clearly, ρ_m = Z_C^{−1} exp(mη) for the occupation probability of the states with
the directional quantum number m.


Fig. 6.28 Brillouin function B_j(η) for j = 1/2, 3/2, and 5/2. For η ≈ 0, it depends
linearly on η, viz., B_j(η) ≈ (1/3) (j + 1) η, and for η ≫ 1, B_j(η) ≈ 1 (saturation)

For the average directional quantum number, we obtain

$$
\overline{m} = \frac{\sum_m m\,\exp(m\eta)}{\sum_m \exp(m\eta)} = \frac{d}{d\eta}\,\ln\frac{\sinh\big((j + \tfrac12)\,\eta\big)}{\sinh\big(\tfrac12\eta\big)} .
$$

The polarization m̄/j is therefore given by the Brillouin function (see Fig. 6.28)

$$
B_j(\eta) \equiv \frac{1}{j}\,\frac{d}{d\eta}\,\ln\frac{\sinh\big((j + \tfrac12)\,\eta\big)}{\sinh\big(\tfrac12\eta\big)}
= \frac{(j + \tfrac12)\,\coth\big((j + \tfrac12)\,\eta\big) - \tfrac12\,\coth\big(\tfrac12\eta\big)}{j} .
$$

For j = 1/2, in particular, B_{1/2}(η) = tanh(η/2) holds. Generally, B_j(η) is a
monotonically increasing function: the stronger the magnetic field H and the lower the
temperature T, the better the orientation.
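The closed form of the Brillouin function can be compared with the direct thermal average over the 2j + 1 levels. This is a minimal sketch; the function names and the sample values of j and η are illustrative.

```python
import math

def brillouin(j, eta):
    """B_j(eta) = [(j + 1/2) coth((j + 1/2) eta) - (1/2) coth(eta/2)] / j."""
    if eta == 0.0:
        return 0.0
    coth = lambda x: 1.0 / math.tanh(x)
    return ((j + 0.5) * coth((j + 0.5) * eta) - 0.5 * coth(0.5 * eta)) / j

def polarization_direct(j, eta):
    """Average of m/j with weights exp(m * eta), m = -j, -j+1, ..., +j."""
    levels = [-j + k for k in range(int(round(2 * j)) + 1)]
    weights = [math.exp(m * eta) for m in levels]
    return sum(m * w for m, w in zip(levels, weights)) / sum(weights) / j

print(brillouin(1.5, 0.7), polarization_direct(1.5, 0.7))   # closed form vs. direct sum
print(brillouin(0.5, 1.3), math.tanh(0.65))                 # special case j = 1/2
print(brillouin(2.5, 1e-4) / 1e-4, (2.5 + 1.0) / 3.0)       # initial slope (j + 1)/3
```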

For the magnetization from mutually independent moments, we obtain N/V times
the mean value of m g μ_B:

$$
M = \frac{N}{V}\,\overline{m}\,g\,\mu_B = \frac{N}{V}\,j\,g\,\mu_B\;B_j\Big(\frac{g\,\mu_B\,\mu_0 H}{kT}\Big) .
$$

So for paramagnetism at low temperatures (η ≫ 1),

$$
M \approx \frac{N}{V}\,j\,g\,\mu_B , \qquad\text{for}\quad kT \ll g\,\mu_B\,\mu_0 H .
$$

Then it depends neither on the temperature nor on the magnetic field, and the
system has reached saturation: all moments are oriented and the magnetization cannot
increase any further. In contrast, at high temperatures, we obtain M ∝ H and hence
for the magnetic susceptibility

$$
\chi = \frac{M}{H} \approx \frac{N}{V}\,\frac{j\,(j+1)\,(g\,\mu_B)^{2}\,\mu_0}{3kT} , \qquad\text{for}\quad kT \gg g\,\mu_B\,\mu_0 H .
$$

It is thus proportional to the reciprocal of the temperature, which is Curie’s law.




6.6.5 Ferromagnetism 


The correlation between the atoms neglected so far (for paramagnetism) is decisive
for ferromagnetism. Here what is important is not so much the magnetic coupling
between the dipole moments, as the exchange symmetry of the fermion states, where
position and spin states are important, because their product has to be antisymmetric
under particle exchange. For this reason even the electric coupling of two electrons
depends on the spin states. This leads to the Ising model

$$
W_{ik} = -2J\,m_i\,m_k ,
$$

where only nearest neighbors i and k interact, although actually the parameter J
depends on the distances. It is adjusted, and even the sign is not the same for all
materials.

We follow P. Weiss with the molecular field approximation and assume an average
one-particle potential. The coupling to the n nearest neighbors is then given simply
by −2nJ m m̄, and for the average directional quantum number m̄, we found j B_j(η)
in the last section. The field at the position of the test particle is now composed of
the external field and the remaining part. Thus we obtain

$$
W_{\text{pot}} = -m\,\{g\,\mu_B\,\mu_0 H + 2nj\,B_j(\eta)\,J\} .
$$

As we have already done for paramagnetism, we may therefore set

$$
W_{\text{pot}} = -m\,\eta\,kT \qquad\text{and}\qquad M = \frac{N}{V}\,g\,\mu_B\,j\,B_j(\eta) ,
$$

but where η now follows from a new equation:

$$
\eta = \frac{g\,\mu_B\,\mu_0 H + 2nj\,B_j(\eta)\,J}{kT}
\quad\Longleftrightarrow\quad
B_j(\eta) = \frac{kT\,\eta - g\,\mu_B\,\mu_0 H}{2nj\,J} .
$$

We have thus to find the points where the Brillouin curve crosses a straight line.
Here the solution with the largest η > 0 is stable, because it has the smallest free
energy, given that the partition function Z_C = sinh((j + 1/2) η)/sinh(η/2) increases
monotonically with η, and therefore F = −kT ln Z_C decreases.

The case J > 0 is particularly instructive, so we shall now restrict ourselves to
this. For H = 0, in addition to the crossing point for η = 0, there is another for η > 0
if

$$
\frac{dB_j(\eta)}{d\eta}\bigg|_{\eta=0} = \frac{j+1}{3} > \frac{kT}{2nj\,J} \equiv \frac{j+1}{3}\,\frac{T}{T_C} ,
\qquad\text{with}\quad kT_C = \tfrac{2}{3}\,n\,j\,(j+1)\,J .
$$

Below the Curie temperature T_C, we also find spontaneous magnetization for H = 0,
because for J > 0 the parallel orientation is favorable for the magnetic moments.
The slope of the above-mentioned straight line is proportional to the temperature,
and therefore its crossing point with the Brillouin curve moves for T → 0 to ever
higher values of η. But then we may set B_j(η) ≈ 1 and find again the saturation
magnetization. In contrast, for T → T_C, the crossing point moves towards the origin.
The magnetization vanishes for T = T_C. In this case, we have to evaluate B_j(η) to
a higher accuracy than we have done so far, because now also the curvature of the
Brillouin curve is important:

$$
B_j(\eta) \approx \frac{j+1}{3}\,\eta - \frac{(j+1)\,(j^{2} + j + \tfrac12)}{45}\,\eta^{3} .
$$

The crossing point with the straight line (1/3) (j + 1) (T/T_C) η then leads to

$$
\eta^{2} \propto 1 - T/T_C ,
$$

and therefore to

$$
M \propto \sqrt{T_C - T} .
$$

For T > T_C and H = 0, there is no solution η ≠ 0.
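The crossing point of the Brillouin curve with the straight line can be found numerically for H = 0. The sketch below solves B_j(η) = (1/3)(j + 1)(T/T_C) η by bisection; the bracket, the lower cutoff of 10^{−3} for η (which excludes temperatures extremely close to T_C), and the function names are illustrative assumptions.

```python
import math

def brillouin(j, eta):
    """B_j(eta) = [(j + 1/2) coth((j + 1/2) eta) - (1/2) coth(eta/2)] / j."""
    if eta == 0.0:
        return 0.0
    coth = lambda x: 1.0 / math.tanh(x)
    return ((j + 0.5) * coth((j + 0.5) * eta) - 0.5 * coth(0.5 * eta)) / j

def spontaneous_eta(j, t_over_tc):
    """Nonzero solution eta of B_j(eta) = (j+1)/3 * (T/T_C) * eta for 0 < T < T_C."""
    if t_over_tc >= 1.0:
        return 0.0                       # only the trivial solution remains
    slope = (j + 1.0) / 3.0 * t_over_tc
    lo, hi = 1e-3, 2.0 / slope           # at hi the line is at 2 > B_j
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if brillouin(j, mid) - slope * mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def spontaneous_polarization(j, t_over_tc):
    """M relative to its saturation value at H = 0."""
    return brillouin(j, spontaneous_eta(j, t_over_tc))

for t in (0.1, 0.5, 0.9, 0.99):
    print(t, spontaneous_polarization(0.5, t))
```

For j = 1/2 the expansion of tanh(η/2) gives η² ≈ 12 (1 − T/T_C) close to the Curie temperature, which the numerical solution reproduces to within about one percent at T/T_C = 0.99.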

For H ≠ 0 this changes, because then the straight line is shifted downwards and
therefore always cuts the Brillouin curve with η > 0, thus also for T > T_C. At least for
these temperatures and for H ≈ 0, we also find η ≈ 0, and therefore we may set
B_j(η) ≈ (1/3) (j + 1) η. This delivers η = g μ_B μ_0 H/(k (T − T_C)), and hence for the
magnetic susceptibility,

$$
\chi \approx \frac{N}{V}\,\frac{j\,(j+1)\,(g\,\mu_B)^{2}\,\mu_0}{3k\,(T - T_C)} , \qquad\text{for}\quad T > T_C .
$$

This Curie–Weiss law reproduces the observation for T ≫ T_C very well, but not close
to the Curie temperature, where the molecular field approximation is too coarse. This
means that the phase transition occurs not exactly at T_C, if we have determined this
parameter using the Curie–Weiss law for higher temperatures. For T < T_C, η is larger
than for H = 0 and the same temperature. Furthermore, the magnetization and the
susceptibility are larger, but the saturation values remain the same.


6.6.6 Bose-Einstein Condensation 


We have in fact already considered a photon gas and lattice vibrations, both examples
of Bose gases, but in both cases the (average) particle number was not given. Now
we shall go back to the case of a given particle number, but start with the grand
canonical ensemble and take

$$
J = -kT\,\ln Z_{GC} = kT \sum_{i=0}^{\infty} \ln\Big(1 - \exp\frac{-(e_i - \mu)}{kT}\Big) .
$$

We choose e_0 as the zero energy and once again write σ for the fugacity exp(μ/kT).
The term i = 0 then contributes kT ln(1 − σ), with 0 < σ < 1. So far we have not
accounted for this in the high-temperature expansions in Sects. 6.5.4 and 6.5.6,
because on replacing the partition function by an integral with the density of states, a
state with the zero energy has no weight:

$$
g(e) = \frac{V}{4\pi^{2}}\,\Big(\frac{2m}{\hbar^{2}}\Big)^{3/2}\,\sqrt{e}
= \frac{V}{\lambda^{3}}\,\frac{2}{\sqrt{\pi}}\,\frac{\sqrt{e/kT}}{kT} .
$$

The internal degrees of freedom are in fact frozen at low temperatures and do not
need to be considered here, but a potential energy would have an effect. In this sense,
we are greatly simplifying here. We now obtain

$$
J = kT\,\ln(1 - \sigma) + kT\,\frac{V}{\lambda^{3}}\,\frac{2}{\sqrt{\pi}} \int_0^{\infty} \sqrt{x}\,\ln\big(1 - \sigma\,\mathrm{e}^{-x}\big)\,dx ,
$$

where x = e/kT. Here, integrating by parts, we find 


i - 20 f” x? dx JT a" 
ji Jx Indl — o e™) dx = [ =-7 DE 


3 e*— o 
n=1 


Thus with the polylogarithm Lis/2(o) (see Fig. 6.23), we obtain 
Vv 
J=kT |Ind —-—o) —kT 3 Lis/2(o) . 
Hence it follows that 


W= M = ~() E) =] F + u ee 


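The first term on the right gives the particle number $\langle n_0 \rangle$ in the ground state and the
rest then the number of particles in excited states ($N^*$). We divide this equation by
$N$ and introduce a critical temperature

$$kT_c = \frac{2\pi\hbar^2}{m}\Bigl(\frac{N}{\zeta(\tfrac32)\,V}\Bigr)^{2/3}
       = \frac{h^2}{2\pi m}\Bigl(\frac{N/V}{\zeta(\tfrac32)}\Bigr)^{2/3} .$$

This increases with increasing density $N/V$. Hence,

$$1 - \frac{\sigma}{N\,(1-\sigma)} = \frac{\mathrm{Li}_{3/2}(\sigma)}{\zeta(\tfrac32)}\,\Bigl(\frac{T}{T_c}\Bigr)^{3/2} .$$

This equation fixes $\sigma(T)$ for given $T_c$. In particular, $\sigma(0) = N/(N+1) \approx 1$. For
$N \gg 1$, this barely changes up to $T = T_c$. In particular, on the left-hand side,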


Fig. 6.29 Bose-Einstein condensation and its dependence on the temperature $T$ relative to the
critical temperature $T_c$. Left: The number of particles in the ground state $\langle n_0 \rangle$ or in excited states
$N^*$ relative to the total number $N$. Right: The chemical potential $\mu$ (plotted as $\mu/kT$ versus $T/T_c$), represented for $N = 100$


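$\sigma = 1 - 1/\sqrt{N}$ delivers approximately $1 - 1/\sqrt{N} \approx 1$, and the right-hand side with
$T = T_c$ and $\sigma = 1$ thus yields 1. Here with $\langle n_0 \rangle = \sigma/(1 - \sigma)$, the whole expression
is equal to $1 - \langle n_0 \rangle/N = N^*/N$. Thus for $T \geq T_c$, it always stays equal to one, and
compared with the total number $N$, the number of particles in the ground state, i.e., $\langle n_0 \rangle$, is clearly
negligible (see Fig. 6.29):

$$\frac{N^*}{N} = \begin{cases} (T/T_c)^{3/2} & \text{for } T \leq T_c , \\ 1 & \text{for } T \geq T_c . \end{cases}$$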


Here, of course, there are always more bosons in the ground state than in any other 
one-particle state—only the sum of numbers over the many excited states may be 
greater than the number in the ground state for higher temperatures. 
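
The behavior of the fugacity and of the excited fraction for a finite particle number can be checked numerically. The following is a minimal sketch, not the authors' method: it solves the particle-number condition $\sigma/(1-\sigma) + N\,(T/T_c)^{3/2}\,\mathrm{Li}_{3/2}(\sigma)/\zeta(3/2) = N$ by bisection, evaluates the polylogarithm by its plain power series, and uses $N = 100$ as in Fig. 6.29. The function names, the series cutoff, and the hard-coded value of $\zeta(3/2)$ are assumptions made for this illustration.

```python
ZETA_3_2 = 2.612375348685488  # numerical value of the Riemann zeta function at 3/2


def polylog(s, sigma, tol=1e-14, max_terms=200_000):
    """Li_s(sigma) = sum_{n>=1} sigma**n / n**s, plain series for 0 <= sigma < 1."""
    total, power = 0.0, 1.0
    for n in range(1, max_terms + 1):
        power *= sigma
        term = power / n**s
        total += term
        if term < tol:  # remaining terms are negligible
            break
    return total


def fugacity(t, n_total):
    """Solve n0 + N* = N for sigma by bisection, with t = T/T_c."""
    def excess(sigma):
        n_ground = sigma / (1.0 - sigma)
        n_excited = n_total * t**1.5 * polylog(1.5, sigma) / ZETA_3_2
        return n_ground + n_excited - n_total

    lo, hi = 0.0, 1.0 - 1e-12  # excess(lo) < 0 < excess(hi); excess grows with sigma
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def excited_fraction(t, n_total):
    """N*/N = 1 - <n0>/N for t = T/T_c."""
    sigma = fugacity(t, n_total)
    return 1.0 - sigma / (1.0 - sigma) / n_total


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(f"T/Tc = {t:3.1f}  sigma = {fugacity(t, 100):.5f}  "
              f"N*/N = {excited_fraction(t, 100):.4f}")
```

For finite $N$ the excited fraction stays slightly below $(T/T_c)^{3/2}$ for $T < T_c$, because $\mathrm{Li}_{3/2}(\sigma) < \zeta(3/2)$ for $\sigma < 1$; the sharp kink of the limiting formula is rounded off.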

These considerations thus lead to $\sigma \approx 1$ for $T \leq T_c$ and to $\mathrm{Li}_{3/2}(\sigma) = \lambda^3 N^*/V$
for $T \geq T_c$, so $\mathrm{Li}_{3/2}(\sigma) = \zeta(\tfrac32)\,(T_c/T)^{3/2}$. If we differentiate this with respect to $T$,
then on the left, we have $\sigma^{-1}\,\mathrm{Li}_{1/2}(\sigma)\,\mathrm{d}\sigma/\mathrm{d}T$ according to the chain rule, and the
polylogarithm diverges for $\sigma = 1$ (more strongly than $-\ln x$ at the origin). On the
right, for $T = T_c$, we obtain the finite value $-\tfrac32\,\zeta(\tfrac32)/T_c$. The derivative of $\sigma$ with
respect to $T$ thus vanishes at $T_c$, and is continuous (as is the chemical potential $\mu$).

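From the generalized grand canonical potential, the pressure and entropy may
also be derived:

$$p = -\Bigl(\frac{\partial J}{\partial V}\Bigr)_{T,\mu} = \frac{kT}{\lambda^3}\,\mathrm{Li}_{5/2}(\sigma) ,$$

$$S = -\Bigl(\frac{\partial J}{\partial T}\Bigr)_{V,\mu} = -k \ln(1-\sigma) + \frac{\tfrac52\,pV - \mu N}{T} .$$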
The bosons in the ground state do not contribute to the pressure, and for fixed $T$ and
$\mu$, $\sigma$ is also constant. For $T \leq T_c$, the pressure depends only on the temperature (and the mass
of the bosons) (proportional to $T^{5/2}$), but not on the density. With decreasing volume,
$T_c$ increases and hence also $\langle n_0 \rangle$. In other words, the particles condense. This also
holds for the internal energy. From $U = J + TS + \mu N$, we obtain $U = \tfrac32\,pV$.

Fig. 6.30 Influence of the Bose-Einstein condensation on the pressure coefficient $\beta$
(and the isochoric heat capacity $C_V = \tfrac32 V\beta$). At $T = T_c$, we have
$\beta = \tfrac52\,\zeta(\tfrac52)/\zeta(\tfrac32)\;Nk/V$. The dashed line is for an ideal gas

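Clearly, the second derivatives of $p$ and $U$ with respect to $T$ are discontinuous at
$T_c$, and so also are the first derivatives of the pressure coefficient $\beta$ and the isochoric
heat capacity $C_V$, as well as the isothermal compressibility $\kappa_T$. Then, for the pressure
coefficient $\beta = (\partial p/\partial T)_{V,N}$, we obtain (see Fig. 6.30)

$$\beta = \frac{Nk}{V} \begin{cases}
\dfrac52\,\dfrac{\zeta(\tfrac52)}{\zeta(\tfrac32)}\,\Bigl(\dfrac{T}{T_c}\Bigr)^{3/2} & \text{for } T \leq T_c , \\[2ex]
\dfrac52\,\dfrac{\mathrm{Li}_{5/2}(\sigma)}{\zeta(\tfrac32)}\,\Bigl(\dfrac{T}{T_c}\Bigr)^{3/2} - \dfrac32\,\dfrac{\mathrm{Li}_{3/2}(\sigma)}{\mathrm{Li}_{1/2}(\sigma)} & \text{for } T \geq T_c .
\end{cases}$$

From this, we also have the isochoric heat capacity $C_V$, because with $U = \tfrac32\,pV$, this
is equal to $\tfrac32 V\beta$ here.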
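
The value of the heat capacity at the critical temperature follows from the low-temperature branch of $\beta$ and can be evaluated with a few lines of code. This is only an illustrative sketch: the helper `zeta` (a direct sum plus an Euler-Maclaurin tail estimate) and the function name `heat_capacity_below_tc` are assumptions introduced here, and the formula used is $C_V/(Nk) = \tfrac{15}{4}\,\zeta(\tfrac52)/\zeta(\tfrac32)\,(T/T_c)^{3/2}$ for $T \leq T_c$.

```python
def zeta(s, cutoff=1000):
    """Riemann zeta for s > 1: direct sum below `cutoff` plus an Euler-Maclaurin tail."""
    head = sum(n**-s for n in range(1, cutoff))
    tail = (cutoff**(1.0 - s) / (s - 1.0)      # integral of n**-s from cutoff to infinity
            + 0.5 * cutoff**-s                 # half of the first omitted term
            + s * cutoff**(-s - 1.0) / 12.0)   # first derivative correction
    return head + tail


def heat_capacity_below_tc(t):
    """C_V / (N k) for t = T/T_c <= 1, using C_V = (3/2) V beta."""
    return 15.0 / 4.0 * zeta(2.5) / zeta(1.5) * t**1.5


if __name__ == "__main__":
    print(f"C_V/(Nk) at T_c: {heat_capacity_below_tc(1.0):.4f}")
    print("classical ideal gas value: 1.5")
```

At $T_c$ the result lies above the classical value $\tfrac32$, which is the cusp visible in Fig. 6.30.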


6.6.7 Summary: Phase Transitions 


As examples of phase transitions and critical behavior, we have investigated in some 
detail the van der Waals gas, magnetism in Weiss’s molecular field approximation, 
and Bose-Einstein condensation. Here the van der Waals equation had to be amended 
by the Maxwell construction, to make the volume a unique function of pressure and 
temperature. 

A phase transition of nth order has a discontinuity in the nth derivative of the free 
enthalpy. The Clausius—Clapeyron equation holds for phase transitions of first order. 
At the critical point, there is a phase transition of second order. Here the density $\rho$ or
the magnetization $M$ are taken as the order parameter. Below the critical temperature,
their value jumps at the phase transition, but it is continuous above. At the critical
temperature, the isothermal compressibility $\kappa_T$ and the susceptibility $\chi$ are infinite.




Problems 


Problem 6.1 Legend tells us that the inventor of chess asked for $S = \sum_{z=0}^{63} 2^z$ grains
of rice as a wage: one grain on the first square, two on the second, and twice as many
on each subsequent square. Compare the sum $S$ for all the squares with the Loschmidt
number $N_L \approx 6 \times 10^{23}$. How often can the surface of the Earth be covered with $S$
grains, if 10 of them are equivalent to 1 square centimeter? By the way, 29% of the
surface of the Earth is covered by land. (3 P)


Problem 6.2 Justify Stirling's formula $n! \approx (n/e)^n \sqrt{2\pi n}$ with the help of the equation
$n! = \int_0^\infty x^n \exp(-x)\,\mathrm{d}x$, using a power series expansion of $n \ln x - x$ about the
maximum and also by comparing $\ln(n!)$ with $n \ln(n/e)$ and $n \ln(n/e) + \tfrac12 \ln(2\pi n)$
for $n = 5$, 10, and 50. (9 P)
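
The numerical half of this comparison can be set up as a small table. The sketch below is one possible way to do it, not part of the problem statement: it uses `math.lgamma(n + 1)` for $\ln(n!)$, and the function name `stirling_table` is introduced only for this illustration.

```python
import math


def stirling_table(values=(5, 10, 50)):
    """Return rows (n, ln n!, n ln(n/e), n ln(n/e) + 0.5 ln(2 pi n))."""
    rows = []
    for n in values:
        exact = math.lgamma(n + 1)                         # ln(n!)
        crude = n * math.log(n / math.e)                   # leading Stirling term
        better = crude + 0.5 * math.log(2 * math.pi * n)   # with the square-root factor
        rows.append((n, exact, crude, better))
    return rows


if __name__ == "__main__":
    print("  n     ln(n!)   n ln(n/e)   + 0.5 ln(2 pi n)")
    for n, exact, crude, better in stirling_table():
        print(f"{n:3d} {exact:10.4f} {crude:11.4f} {better:18.4f}")
```

The remaining difference between $\ln(n!)$ and the improved approximation is positive and smaller than $1/(12n)$, so it shrinks quickly with growing $n$.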


Problem 6.3 Draw the binomial distribution $\rho_z = \binom{Z}{z}\,p^z (1 - p)^{Z-z}$ when $Z = 10$
for $p = 0.5$ and $p = 0.1$. Compare this with the associated Gauss distribution (equal
$\langle z \rangle$ and $\Delta z$) and for $p = 0.1$ with the associated Poisson distribution. Note that the
Gauss and Poisson distributions also assign values for $z > 10$, which we do not
want to consider. For comparisons, set up tables with three digits after the decimal
point, no drawings. (8 P)


Problem 6.4 From the binomial distribution for $Z \gg 1$, derive the Gauss distribution
if the probabilities $p$ and $q = 1 - p$ are not too small compared to one.


Hint: Here it is useful to investigate the properties of the binomial distribution near
its maximum and let $\rho$ depend continuously on $z$. (8 P)


Problem 6.5 How high is the probability for z decays in 10 seconds in a radioactive 
source with an activity of 0.4 Bq? Give in particular the values p(z) for z = 0 to 10 
with two digits after the decimal point. (6 P) 
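
A possible way to tabulate the requested values is sketched below. It assumes that the decays are independent, so that the number of decays in the interval follows a Poisson distribution with mean $0.4\ \mathrm{Bq} \times 10\ \mathrm{s} = 4$; the function name `poisson` and the variable names are introduced only for this illustration.

```python
import math


def poisson(z, mean):
    """Poisson probability for z events when `mean` events are expected."""
    return mean**z * math.exp(-mean) / math.factorial(z)


activity_bq = 0.4     # decays per second, from the problem statement
interval_s = 10.0     # observation time in seconds
mean = activity_bq * interval_s

# Table of (z, p(z)) rounded to two digits after the decimal point
table = [(z, round(poisson(z, mean), 2)) for z in range(11)]

if __name__ == "__main__":
    for z, prob in table:
        print(f"z = {z:2d}   p(z) = {prob:.2f}")
```

With a mean of 4, the probabilities for $z = 3$ and $z = 4$ coincide, since $4^3/3! = 4^4/4!$.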


Problem 6.6 Which probability distribution $\{\rho_z\}$ delivers the highest information
measure $I = -\sum_{z=1}^{Z} \rho_z\,\mathrm{lb}\,\rho_z$?


Hint: Note the constraint $\sum_{z=1}^{Z} \rho_z = 1$.


How does $I$ change if initially $Z_1$ states are occupied with equal probability and then
$Z_2 < Z_1$? Freezing of degrees of freedom: Determine $\Delta I$ for $Z_1 = 10$ and $Z_2 = 2$.
For two possibilities, $I$ may be written as a function of just $\rho = \rho(z_1)$. Set up a table
of values with the step width 0.05. (6 P)


Problem 6.7 In phase space, every linear harmonic oscillation proceeds along an
ellipse. How does the area of this ellipse depend on the energy and oscillation period?
By how much do the areas of the ellipses of two oscillators differ when their energies
differ by $\hbar\omega$? Determine the probability density $\rho(x)$ for a given oscillation
amplitude $x_0$ and equally distributed phases $\varphi$.


Hint: Thus we may set $x = x_0 \sin(\omega t + \varphi)$. Actually, the probability density should
be taken at time $t$. Why is this unnecessary here? (7 P)


Problem 6.8 A molecule in a gas travels equal distances $l$ between collisions with
other molecules. We assume that the molecules are of the same kind, but always at 
rest, a useful simplification which does not falsify the result. Here all directions occur 
with equal probability. Determine the average square of the distance from the initial 
point after n elastic collisions, and express the result as a function of time. (4 P) 


Problem 6.9 Does $\rho(t, \mathbf{r}) = \sqrt{4\pi Dt}^{\,-3} \exp(-r^2/4Dt)$ solve the diffusion equation
$\partial\rho/\partial t = D\,\Delta\rho$, and does it obey the initial condition $\rho(0, \mathbf{r}) = \delta(\mathbf{r})$? What is the
time dependence of $\langle r^2 \rangle$? Compare with Problem 6.8. How do the solutions $\rho(t, \mathbf{r})$
read in one and two dimensions? (9 P)


Problem 6.10 Consider $N$ interaction-free molecules each of which is equally probable
in any of two equal sections of a container. What is the probability for all $N$
molecules to be in just one of the sections? If each of the possibilities since the
existence of the world ($2 \times 10^{10}$ years) has occurred corresponding to its probability, how
long have 100 molecules (very, very few for macroscopic processes!) been in one
section? (2 P)


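Problem 6.11 Given the Maxwell distribution

$$\rho(v) = 4\pi v^2\,(2\pi kT/m)^{-3/2} \exp(-mv^2/2kT) ,$$

determine the most frequent and the average velocities ($\hat{v}$, $\langle v \rangle$), kinetic energies ($\hat{E}$,
$\langle E \rangle$), and de Broglie wavelengths ($\hat{\lambda}$, $\langle \lambda \rangle$).


Hint:

$$\int_0^\infty \exp(-ax^2)\,\mathrm{d}x = \frac12 \sqrt{\frac{\pi}{a}} ,$$

$$\int_0^\infty x^{2n} \exp(-ax^2)\,\mathrm{d}x = \Bigl(-\frac{\partial}{\partial a}\Bigr)^{n} \int_0^\infty \exp(-ax^2)\,\mathrm{d}x
  = \frac{(2n-1)!!}{2^{n+1}\,a^{n}} \sqrt{\frac{\pi}{a}} ,$$

$$\int_0^\infty x^{2n+1} \exp(-ax^2)\,\mathrm{d}x = \frac12 \int_0^\infty y^{n} \exp(-ay)\,\mathrm{d}y = \frac{n!}{2\,a^{n+1}} .$$

The first integral is half as large as $\int_{-\infty}^{\infty}$ and the latter equal to the square root of the
surface integral

$$\iint_{-\infty}^{\infty} \exp\{-a\,(x^2 + y^2)\}\,\mathrm{d}x\,\mathrm{d}y = 2\pi \int_0^\infty \exp(-ar^2)\,r\,\mathrm{d}r = \pi \int_0^\infty e^{-ax}\,\mathrm{d}x .$$

(8 P)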




Problem 6.12 Consider the 1D diffusion equation $\partial y/\partial t = D\,\partial^2 y/\partial x^2$ with the
boundary condition $y(t, 0) = c(0) \exp(-\mathrm{i}\omega t)$. Which differential equation follows
for $c(x)$, and what are its physical solutions for $x > 0$? (Example: seasonal ground
temperature.) (3 P)


Problem 6.13 Under what circumstances do the Maxwell equations yield a diffusion 
equation for the electric field strength? How large is the diffusion constant under such 
circumstances? (3 P) 


Problem 6.14 For a molecular beam, all velocities $\mathbf{v}$ outside of a small solid angle
$\mathrm{d}\Omega$ around the beam direction are suppressed. How large is the number of suppressed
molecules with velocities between $v$ and $v + \mathrm{d}v$ per unit time and unit area?
Determine the most frequent and the average velocity in the beam. (4 P)


Problem 6.15 According to quantum theory, the phase space cells cover the area $h$.
Therefore, according to Problem 6.7, the number of states of one linear oscillator up
to the highest excitation energy $E$ is equal to $\Omega(E, 1) = E/\hbar\omega + 1 = n + 1$, with
the oscillator quantum number $n$. Determine $\Omega(E, 2)$ for distinguishable oscillators
and then $\Omega(E, N)$ by counting. Simplify the result for the case $n \gg N$. Is the density
of states for this system equal to $\frac{1}{(N-1)!}\,E^{N-1}\,(\hbar\omega)^{-N}$?


Hint: The binomial coefficients for natural $m$ and arbitrary $x$ are given by

$$\binom{x}{m} = \frac{x\,(x-1)\cdots(x-m+1)}{m!} = \frac{x-m+1}{m}\,\binom{x}{m-1} .$$


Consequently, 


Ga) (EG). minem (2-0 


In addition, 


1 1 n—m P k 
ET = + : , and hence ma = 5 ý . 
m m m— 1 m+ 1 rar m 


(6 P) 


Problem 6.16 From the expression found for $\Omega(E, N)$ in Problem 6.15, determine
the canonical partition function and hence the average energy $\langle E \rangle$ and the squared
relative fluctuation $(\Delta E/\langle E \rangle)^2$. (4 P)


Problem 6.17 The energy of $N$ non-interacting spin-$\tfrac12$ particles with magnetic
moments $\mu$ in the magnetic field is $E = (n_\downarrow - n_\uparrow)\,\mu B$. What is the microcanonical
partition function of this system? (4 P)




Problem 6.18 Take the result of the last problem as a binomial distribution (with
the energy as state variable), and approximate it by the Gauss distribution for $\mu B \ll
\mathrm{d}E \ll E$. Thereby determine the entropy. How does the entropy differ from the one
found in Problem 6.17, obtained with the Stirling formula for $E \ll N\mu B$? (6 P)


Problem 6.19 Determine, as for the equidistribution law, (p,x’") and (x” pn) for 
canonical ensembles of particles which are enclosed between impenetrable walls. 
Why are these considerations not also valid for unbound particles? (4 P) 


Problem 6.20 For an $N$-particle system, the expression $\sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{F}_i$ is called the
virial of the force. What follows for its expectation value? Compare the result with 
(Exin) = N 3 (X - x) and with the virial theorem of classical mechanics. Note that this 
holds for the mean value over the time (!), and in fact for “quasi-periodic” systems, 
i.e., x and p always have to stay finite. (5 P) 


Problem 6.21 Consider the 1D diffusion equation $\partial y/\partial t = D\,\partial^2 y/\partial x^2$. How do
its solutions read with the initial condition $y(0, x) = f(x)$ instead of the boundary
condition of Problem 6.12? (2 P)


Problem 6.22 The gas pressure p on the walls can be determined from the momen- 
tum change due to the elastic collision of the molecules. Determine the pressure as a 
function of the average energy of the individual molecules. Here the same assump- 
tions are made as for the derivation of the Boltzmann equation. Do we need the 
Maxwell distribution? What follows for (E) if the ideal gas equation pV = NkT 
holds? (6 P) 


Problem 6.23 Ina galvanometer, a quartz fiber with the torque ô = 10~'° J supports 
a plane mirror. How large is the directional uncertainty at 20°C from the Brownian 
motion of the air molecules? How much does a reflected light beam fluctuate on a 
target scale at 1 m distance? (3 P) 


Problem 6.24 For an ideal monatomic gas, $p\,V^{5/3}$ is a constant for isentropic processes.
How much does the internal energy $U$ change if the volume increases from
$V_0$ to $V$? Does $U$ increase or decrease? (3 P)


Problem 6.25 Consider a cycle in an $(S, T)$ diagram. What area corresponds to the
usable work and what area to the heat energy input? Consider a heat engine with
the heat input $Q_+ = T_+ \Delta S_1$ at the temperature $T_+$ and $Q_0 = T_0 \Delta S_2$ at $T_0 < T_+$,
as well as heat output $Q_- = T_- (\Delta S_1 + \Delta S_2)$ at $T_- < T_0$. Determine the efficiency
$\eta(Q_+, Q_0, Q_-)$ and compare it with the efficiency of an ideal Carnot process ($\eta_C$
with $Q_0 = 0$). Express the result as a function of $\eta_C$, $Q_0/Q_+$, and $T_0/T_+$. Determine
a least upper bound for the efficiency of a cycle process with heat reservoirs at several
input and output temperatures. (9 P)




Problem 6.26 Why do we have to do work to pump heat from a cold to a hot
medium? Investigate this with an ideal cycle. Under ideal constraints, let the work
$A$ be necessary in order to keep a house at the temperature $T_+$ inside, while the
temperature outside is $T_-$. How are these three quantities connected with the heat
loss $Q_+$? How is the input heat $Q'_+$ in an ideal power plant related to the heat loss $Q_+$
considered above if it works between the temperatures $T'_+$ and $T'_-$? Neglect the losses
in the power plant that delivers the electric energy. Take as an example $T'_+ = 800\,^\circ$C,
$T_+ = 20\,^\circ$C, and $T_- = T'_- = 0\,^\circ$C. (8 P)


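Problem 6.27 Determine the functional determinant

$$\frac{\partial(S, T)}{\partial(V, p)} = \Bigl(\frac{\partial S}{\partial V}\Bigr)_p \Bigl(\frac{\partial T}{\partial p}\Bigr)_V - \Bigl(\frac{\partial S}{\partial p}\Bigr)_V \Bigl(\frac{\partial T}{\partial V}\Bigr)_p .$$


Problem 6.28 Express the derivatives of $S$ with respect to $T$, $V$, and $p$, with the other
parameters kept fixed, in terms of the thermal coefficients and $V$ and $T$. Express the
derivatives of $T$ with respect to $S$, $V$, and $p$ in terms of the quantities above. Express
$(\partial F/\partial T)_p$ and $(\partial G/\partial T)_V$ in terms of these quantities. (6 P)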


Problem 6.29 Are $(\partial^2 U/\partial S^2)_V$, $(\partial^2 U/\partial V^2)_S$, $(\partial^2 G/\partial T^2)_p$, and $(\partial^2 G/\partial p^2)_T$
always positive? (4 P)


Problem 6.30 If a charge $\mathrm{d}q$ is inserted isothermally and isochorically into a
reversibly working galvanic element at the open circuit voltage $\Phi$, the work $\delta A =
\Phi\,\mathrm{d}q$ is done. How does its internal energy change for given $\Phi(T)$?


Hint: Note the integrability condition for the free energy $F$. In addition, we should
have $\delta A = \varphi\,\mathrm{d}Q$, if upper-case letters always stand for extensive quantities and lower-case
letters for intensive quantities. (4 P)


Problem 6.31 What vapor pressure p(T) is obtained from the Clausius—Clapeyron 
equation if we assume a constant transition heat Q, neglecting the volume of the 
liquid compared to the volume of the gas, and using the equation pV = NkT for an 
ideal gas? (4 P) 


Problem 6.32 One liter of water at 20 °C and normal pressure (1013 hPa) is subject
to a pressure twenty times the normal pressure. Here the compressibility is 0.5/GPa
on average and the expansion coefficient $2 \times 10^{-4}$/K. Determine $V/V_0$ as a function
of $p$ and $p_0$ (give values in numbers as well). How much work is necessary for the
change of state? By how much does the internal energy change? (6 P)




Problem 6.33 At the freezing temperature, ice has the density 0.918 g/cm³ and
water the density 0.99984 g/cm³. An energy of 6.007 kJ/mole is needed to melt ice.
How large are the discontinuities in the four thermodynamic potentials for this phase
transition (relative to one mole)? (4 P)


Problem 6.34 What is the connection between $(\partial U/\partial V)_T$ and $(\partial p/\partial T)_V$? Can
$(\partial C_V/\partial V)_T$ be uniquely determined for a given thermal equation of state? Transfer
the results to the enthalpy and $C_p$. (6 P)


Problem 6.35 For a given heat capacity $C_V(T, V)$ and thermal equation of state,
is the entropy uniquely defined? Can we then also determine the thermodynamic 
potentials? (4 P) 


Problem 6.36 From thermal coefficients for ideal gases, derive the relation

$$p\,V^{\kappa} = \text{const.}$$


for isentropic processes. Determine V (T) and p(T) for adiabatic changes in ideal 
gases. How does the sound velocity c in an ideal gas depend on T, and what is 
obtained for nitrogen at 290 K? (6 P) 


Problem 6.37 For a mole of $^4$He at 1 bar and 290 K, determine the thermal de
Broglie wavelength $\lambda$, the fugacity $\exp(\mu/kT)$, the free enthalpy (in J), and the
entropy (in J/K). Here, helium may be taken as an ideal gas. (4 P)


Problem 6.38 How is the thermal equation of state for ideal monatomic gases to
be modified in order to account to first order for the difference in $\ln Z_{GC}$ between
bosons and fermions?


Hint: We may expand $pV/kT$ in powers of the fugacity and express this in terms of
$N$, $V$, and $\lambda$.


Compare the pressures of the Bose and Fermi gases with that of a classical gas. (8 P) 


Problem 6.39 How do the pressure and temperature of the air depend on the height 
for constant gravitational acceleration if heat conduction is negligible compared to 
convection and therefore each mass element keeps its entropy? This is more realistic 
than the assumption of constant temperature. (2 P) 


Problem 6.40 Consider the heating of a house as an isobaric—isochoric situation: 
the air expands with increasing temperature and escapes through leakages. Assuming 
an ideal gas, how does the number of molecules in the house change, and how does 
the internal energy change, assuming that there are no internal excitations of the 
molecules? Does the entropy increase or decrease? Or is this clear anyway from the
entropy law? (Heating is not an energy problem, but an entropy problem!) (5 P) 




Fig. 6.31 Diesel cycle in a $(V, p)$ diagram with corner points 1 to 4. Idealized cycle from 1 to 2
and from 3 to 4 along isentropic (adiabatic) curves of an ideal gas, between either isobaric
(2 → 3) or isochoric (4 → 1) curves. Contrast with twice isochoric for the Otto cycle and twice
isobaric for the Joule cycle (gas turbine)


Problem 6.41 To extend a surface by $\mathrm{d}A$, work $\delta W = \sigma\,\mathrm{d}A$ has to be done against
the attraction between the molecules, where $\sigma$ is the surface tension. What sign
does $(\partial\sigma/\partial T)_A$ have? How does the free energy change for an isothermal surface extension
(without volume change) and how does the internal energy change? How much heat
is involved in an isothermal surface extension assuming that $\sigma(T, A)$ is given? (6 P)


Problem 6.42 For four-stroke engines (suction, compression, combustion, ejec- 
tion), only two cycles are assumed to be idealized. For example, Fig. 6.31 shows 
the diesel cycle. Note that diesel engines are “compression—ignition engines”: the 
fuel burns at approximately constant pressure. Which two cycles are related to the 
diesel cycle (why?), and which path is taken by the one and the other in Fig. 6.31? 
What is the efficiency of the idealized diesel engine as a function of the compression
$K = V_1/V_2$ and expansion $E = V_4/V_3$, assuming a single ideal diatomic gas,
i.e., assuming the air to be pure nitrogen? Note that, clearly, K > E > 1. Begin by 
expressing Q+ in terms of the relevant temperatures. The compression depends on 
the construction, but the expansion does not. It is determined by the “heat of com- 
bustion” (combustion enthalpy). Determine the ratio K/E of the enthalpies. (9 P) 


List of Symbols 


We stick closely to the recommendations of the International Union of Pure and 
Applied Physics (IUPAP) and the Deutsches Institut für Normung (DIN). These
are listed in Symbole, Einheiten und Nomenklatur in der Physik (Physik-Verlag, 
Weinheim 1980) and are marked here with an asterisk. However, one and the same 
symbol may represent different quantities in different branches of physics. Therefore, 
we have to divide the list of symbols into different parts (Table 6.3). 


Table 6.3 Symbols used in thermodynamics and statistics

Symbol | Name | Page number
* $Q$ | Amount of heat | 513
* $A$ | Work | 513
* $V$ | Volume | 9
* $p$ | Pressure | 560
* $N$ | Particle number | 552
* $\mu$ | Chemical potential | 560
* $S$ | Entropy | 523
* $T$ | Temperature | 558
* $U$ | Internal energy | 556
* $F = U - TS$ | (Helmholtz) Free energy | 567
* $H = U + pV$ | Enthalpy | 567
* $G = H - TS$ | (Gibbs) Free enthalpy | 567
$J = F - \mu N$ | Grand canonical potential | 567
* $\alpha = \frac{1}{V}\bigl(\frac{\partial V}{\partial T}\bigr)_p$ | (Volume-) Expansion coefficient | 569
* $\beta = \bigl(\frac{\partial p}{\partial T}\bigr)_V$ | Pressure coefficient | 569
* $C_p = T\bigl(\frac{\partial S}{\partial T}\bigr)_p$ | Isobaric heat capacity | 569
* $C_V = T\bigl(\frac{\partial S}{\partial T}\bigr)_V$ | Isochoric heat capacity | 569
* (a) $\kappa_T = -\frac{1}{V}\bigl(\frac{\partial V}{\partial p}\bigr)_T$ | Isothermal compressibility | 569
$\kappa_S = -\frac{1}{V}\bigl(\frac{\partial V}{\partial p}\bigr)_S$ | Adiabatic compressibility | 569
* $c$ | Sound velocity | 570
$\rho_z$ | Probability for the state $z$ | 515
$\Omega$ | Partition function up to limiting energy | 525, 550
$Z$ | Partition function | 549, 556
(b) $Z_C$ | Canonical partition function | 554
(b) $Z_{MC}$ | Micro-canonical partition function | 549
(b) $Z_{GC}$ | Macro-canonical partition function | 553
* $\tau$ | Relaxation time | 527
* $k$ | Boltzmann constant | 623
* $N_A$ | Avogadro constant | 623
* $R$ | Gas constant | 572
* $\nu$ | Stoichiometric coefficient | 561

(a) For this compressibility, the abbreviation $\kappa$ is recommended. However, we also use it for the
isentropic exponent $-(V/p)\,(\partial p/\partial V)_S = 1/(p\kappa_S)$. For an ideal gas it is equal to the ratio $\kappa_T/\kappa_S =
C_p/C_V$

(b) The abbreviations "C", "MC", and "GC" stand for canonical, micro-canonical, and grand canonical,
and we also use them for the probabilities $\rho_C$, $\rho_{MC}$, and $\rho_{GC}$

References 


1. C. Caratheodory, Sitzungsber. Preu. Akad. 33 (3 July 1919) 

2. A. Sommerfeld, Lectures on Theoretical Physics 5-Thermodynamics and Statistical Mechanics 
(Academic, London-Elsevier, Amsterdam, 1964) 

3. Ch. Kittel, H. Krämer, Thermal Physics, 2nd edn. (W.H. Freeman, San Francisco, 1980) 

4. F. Reif, Fundamentals of Statistical and Thermal Physics (McGraw-Hill, New York NY, 1965— 
Waveland Press, Long Grove, 2010) 

5. R. Zurmiihl, Matrizen (Springer, Berlin, 1964). in German 

6. C. Syros, The linear Boltzmann equation properties and solutions. Phys. Rep. 45, 211-300 
(1978) 

7. H. Risken, The Fokker-Planck Equation (Springer, Berlin, 1989) 

8. J.R. Rumble (Ed.), CRC Handbook of Chemistry and Physics, 98th edn. (CRC Press, Taylor 
& Francis, London, 2017) 

9. A. Sommerfeld, Lectures on Theoretical Physics 6—Partial Differential Equations in Physics 
(Academic, London-Elsevier, Amsterdam, 1964) 

10. A. Bohr, B.R. Mottelson, Nuclear structure, Vol. 1 (Benjamin 1969—World Scientific 1998) 


Suggestions for Textbooks and Further Reading 


11. R. Baierlein, Thermal Physics (Cambridge University Press, Cambridge, 1999) 

12. S.J. Blundell, K.M. Blundell, Concepts in Thermal Physics, 2nd edn. (Oxford University Press, 
Oxford, 2010) 

13. N.N. Bogolubov, N.N. Bogoluboy Jr., Introduction to Quantum Statistical Mechanics (World 
Scientific, Singapore, 1982) 

14. W. Greiner, L. Heise, H. Stécker, Thermodynamics and Statistical Mechanics (Springer, New 
York, 1995) 

15. L.P. Kadanov, G. Baym, Quantum Statistical Mechanics (Benjamin, New York, 1982) 

16. D. Kondepudi, Introduction to Modern Thermodynamics (Wiley, Chichester, 2008) 

17. L.D. Landau, E.M. Lifshitz, Course of Theoretical Physics Vol. 5—Statistical Physics 3rd edn., 
(Butterworth-Heinemann, Oxford, 1980) 

18. E.M. Lifshitz, L.P. Pitaevskii, Course of Theoretical Physics Vol. 9—Statistical Physics Part 2— 
Theory of the Condensed State (Butterworth-Heinemann, Oxford, 1980) 

19. W. Nolting, Theoretical Physics 5-Thermodynamics (Springer, Berlin, 2017) 


References 621 


20. B.N. Roy, Fundamentals of Classical and Statistical Thermodynamics (Wiley, Chichester, 
2002) 

21. F. Scheck, Statistical Theory of Heat (Springer, Berlin, 2016) 

22. D.V. Schroeder, An Introduction to Thermal Physics (Addison-Wesley, San Francisco, 2000) 


Appendix A 
Important Constants 


This appendix contains four tables. Table A.1 gives the names for different powers
of 10, Tables A.2 and A.3 give some important constants, and Table A.4 gives
some derived quantities. The generally accepted CODATA values are taken from
http://www.physics.nist.gov/cuu/Constants/Table/allascii.txt

Energy conversion units: J = kg m²/s² = N m = W s = V A s = V C = A Wb = Pa m³.


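Table A.1 Terminology for powers of 10


Factor | Prefix | Abbreviation | Factor | Prefix | Abbreviation
$10^{-1}$ | deci | d | $10^{+1}$ | deca | da
$10^{-2}$ | centi | c | $10^{+2}$ | hecto | h
$10^{-3}$ | milli | m | $10^{+3}$ | kilo | k
$10^{-6}$ | micro | µ | $10^{+6}$ | mega | M
$10^{-9}$ | nano | n | $10^{+9}$ | giga | G
$10^{-12}$ | pico | p | $10^{+12}$ | tera | T
$10^{-15}$ | femto | f | $10^{+15}$ | peta | P
$10^{-18}$ | atto | a | $10^{+18}$ | exa | E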
© Springer Nature Switzerland AG 2018 623 


A. Lindner and D. Strauch, A Complete Course 
on Theoretical Physics, Undergraduate Lecture Notes in Physics, 
https://doi.org/10.1007/978-3-030-04360-5 




Table A.2 Important constants in vacuum by choice of the units (m, A). The mass unit (like the
units of meter, second, and ampere) is expected to be a quantity defined by independent elementary
quantities in the near future, from May 20, 2019 on


Quantity | Symbol | Value | Unit
Light velocity | $c_0$ | 299,792,458 | m/s
Magnetic field constant | $\mu_0$ | $4\pi \times 10^{-7}$ | N/A²
 | | $12.566370614359 \times 10^{-7}$ | N/A² = H/m
Electric field constant | $\varepsilon_0 = 1/\mu_0 c_0^2$ | $8.854187817622 \times 10^{-12}$ | F/m
Elementary charge | $e$ | $1.602176634 \times 10^{-19}$ | C
Planck constant | $h$ | $6.62607015 \times 10^{-34}$ | J s
Action quantum | $\hbar = h/2\pi$ | $1.054571818\ldots \times 10^{-34}$ | J s
Boltzmann constant | $k$ | $1.380649 \times 10^{-23}$ | J/K
Avogadro constant | $N_A$ | $6.02214076 \times 10^{23}$ | 1/mol
Atomic mass constant | u | $1.66053922\ldots \times 10^{-27}$ | kg

Table A.3 Further constants

Quantity | Symbol | Value | Unit
Gravitational constant | $G$ | $6.67408(31) \times 10^{-11}$ | m³/(kg s²)
Electron mass | $m_e$ | $9.10938356(16) \times 10^{-31}$ | kg
 | | $5.48579909070(16) \times 10^{-4}$ | u
Proton mass | $m_p$ | $1.672621898(21) \times 10^{-27}$ | kg
 | | $1.007276466789(91)$ | u
Neutron mass | $m_n$ | $1.674927471(21) \times 10^{-27}$ | kg
 | | $1.00866491588(49)$ | u

Table A.4 Derived quantities

Quantity | Symbol | Value | Unit
Fine structure constant | $\alpha = \mu_0 c_0 e^2/2h$ | $7.2973525664(17) \times 10^{-3}$ |
 | | $= 1/137.0359991\ldots$ |
Bohr magneton | $\mu_B = e\hbar/2m_e$ | $9.2740089994(57) \times 10^{-24}$ | J/T
Stefan–Boltzmann constant | $\sigma = \pi^2 k^4/60\hbar^3 c_0^2$ | $5.670367(13) \times 10^{-8}$ | W/(m² K⁴)

Index 


A 
A (ampere), 164, 200, 623 
Aberration, 236 
Absorption circuit, 214 
Absorption, forced, 482 
Acceleration field, 257 
Action 
action function, 135-140, 245 
action variable (phase integral), 136 
reduced, 136-141 
Action principle, 140 
Action quantum, 276, 341, 524, 624 
Active resistance, 213 
Activity, absolute (fugacity), 582 
Addition law for velocities, 234 
Addition theorem for spherical harmonics, 
400 
Adiabatic theorem, 296 
Aggregation state (phase), 572 
Alloy, 574 
Amount of heat, 563-565 
Ampére’s circuital law, 195 
Angular frequency, 137 
Angular momentum, 70, 100 
conservation, 77 
coupling, 335-337 
of the radiation field, 464 
of two particles, 72 
operator, 328-329 
rigid body, 86 
Annihilation operator, 330-331, 470 
for bosons, 302, 440-443 
for fermions, 438—440, 442-443 
Anomaly, magneto-mechanical, 327 
Anti-correlation, 520 




Anti-normal order, 477 
Anti-particle, 501 
Approximation 

adiabatic, 347 

Born, 405 

better (DWBA), 420 

Area—velocity law, 64, 142 
Atomic mass constant, 571, 624 
Atomic model, Bohr’s, 367 
Attractor, 107 
Auto-correlation, 520 
Avogadro constant, 572, 624 
Azimuth, 30, 31 


B 
Balance equation, 526 
Base vector, 31—33 
contravariant, 32 
covariant, 32 
Basic relation of thermodynamics, 561-562 
irreversible, 562, 576 
BCS theory, 457—462 
Beats, 115 
Behavior, critical, 603—605 
Bernoulli distribution, 517 
Bernoulli equation, 574 
Bessel function 
integer, 480 
spherical (half integer), 400 
Bi-orthogonal system, 426 
Binomial coefficient, 365, 614 
Binomial distribution, 516-518 
Binormal vector, 7 
Biot—Savart law, 193 




Bit, 521 
Bloch equation, 486 
Bloch function, 116 
Bloch vector, 313, 343 
Blue sky, 264 
Body 

massive, 79 

rigid, 85-90, 

Euler equations, 92 

Bogoliubov transformation 

for bosons, 473-475 

for fermions, 458 
Bohr magneton, 327, 624 
Bohr radius, 362 
Bohr-Sommerfeld quantization, 136 
Boiling point, raising, 574 
Boltzmann constant, 523, 624 
Boltzmann equation, 531-533 

collision-free, 345, 530 
Boltzmann statistics, corrected, 578 
Bose-Einstein condensation, 608-611 
Bose-Einstein statistics, 578-582 
Bosons, 435-438, 440-442 
Boundary condition 

asymptotic, 424 

periodic, 452—453 
Boundary conditions (conductor/insulator), 

225 

electrostatics, 177 
Box potential, 354—358 
Bra-vector, 283 
Braking radiation, 265 
Breit—Wigner formula, 426-427 
Brewster angle, 223 
Brillouin function, 606 


C 
Capacitor 

cylindrical, 179 

plate, 180 

spherical, 179 
Capacity, 179-180 
Cauchy—Riemann equations, 177 
Cauchy sequence, 284 
C (coulomb), 164, 623 
Center-of-mass law, 70-71 
Central field, 142—144 
Central force, 55 
Centrifugal force, 91 
Centrifugal potential, 142 
Change of representation, 286 
Change of state 




adiabatic, 566 

irreversible, 576 

reversible, 558 
Channel 

closed, 424 

open, 423 
Channel Hamilton operator, 429 
Channel radius, 424 
Channel resolvent, 429 
Chaos, molecular, 532 
Characteristic equation of an eigenvalue 

problem, 88, 114 

Characteristic function 

(anti)normal-ordered, 479 

(reduced action), 135-141 
Charge 

apparent, 174 

electric, 165-166 
Charge conjugation, 500-501 
Charge density, 166 
Christoffel symbols, 41—42 
Circuit, oscillating, 213-214 
Circular orbit, 67 
Circulation voltage, 206 
Clausius—Clapeyron equation, 573 
Clausius—Mosotti formula, 175 
Clebsch—Gordan coefficient, 337 
Clifford algebra, 490 
Coefficient 

stoichiometric, 561 

thermal, 568—571 
Coexistence curve, 573 
Coherence, 312 
Collapse of the wave function, 389 
Collision integral, 533 
Collision, inverse, 531 
Collision law, 73-76 
Collision parameter, 67 
Column vector, 3 
Commutation relation, 315-317 
Compass needle, 102 
Completeness relation, 285 
Compressibility, 569-571 
Conduction electrons, 588 
Conductivity, electric, 187 
Configuration mixture, 456 
Configuration space, 59 
Conservation law, 238 

of angular momentum, 77 

of charge, 186, 204 

of energy, 78 

of momentum, 69 
Conserved quantity, 69, 77—79 




Constant of the motion, 101 
Constraint, 94—95 
bilateral, 94 
holonomic (integrable), 94 
rheonomous, 94 
scleronomous, 94 
unilateral, 94 
Contact voltage, 178 
Continuity equation, 187 
Continuum, normalization in the, 287 
Convolution integral, 22 
Coordinate 
Cartesian, 3 
curvilinear, 31—44 
cyclic, 99 
general, 31-44 
generalized, 59-62 
oblique, 31-44 
Coordinate transformation, 33—34 
Core electrons, 362 
Coriolis force, 91 
Correlation, 520-521 
Correlation coefficient, 520 
Correlation function, 534 
Correspondence principle, 325-327 
Coulomb force, 410 
Coulomb gauge, 197, 210 
Coulomb law, 165-169 
Coulomb parameter, 422 
Coulomb scattering amplitude, 422 
Coulomb scattering phase, 422 
Coulomb wave functions, 422 
Counter-force, 55 
Coupling of angular momenta, 335-337 
Covariant, bilinear, 498 
CPT theorem, 500 
Creation operator, 330-331, 470 
for bosons, 302, 440-443 
for fermions, 438—440, 442-443 
Curie law, 606 
Curie temperature, 607 
Curie—Weiss law, 608 
Curl, 13-14 
Curl density, 13—14 
Current 
electric, 186-189 
quasi-static, 205 
stationary, 187 
Current density, 186, 348-350 
Current strength, 186 
Curvature, 7—9 
second, 8 
Cycle, Diesel/Otto/Joule, 618 


Cycle process, 563-565, 615 
Carnot, 564 

Cyclotron frequency, 78, 189 

Cylindrical capacitor, 179 

Cylindrical coordinates, 40 

Cylindrical symmetry, 40 


D 
D’Alembert operator, 239
Damping, aperiodic, 108 
Darboux vector, 8 
De Broglie relation, 319 
De Broglie wavelength, thermal, 583 
Debye function, 597-598 
Debye temperature, 598 
Decay coefficient, 106 
Decay length, 224 
Decay, radioactive, 528 
Decay time, 106, 527 
Decoherence, 389 
Decoupling, 114 
Degeneracy, 114, 295 
accidental, 355 
Degrees of freedom, 59 
frozen, 522, 560 
of a system, 374 
Delta function, 18—22 
transverse, 469-470 
Density of states, 550-552 
Density operator, 311-313 
reduced, 375-389 
time dependence, 342-344 
Derivative 
covariant, 42 
partial, 11 
Determinant, 5 
Detuning, 483-486 
Deviation, 87 
Deviation, average (square), 516 
Diamagnet, 196 
Dielectric constant (permittivity), 176 
Diesel cycle, 618 
Differential equation 
Euler’s, 140 
Hill’s, 116-120 
Mathieu’s, 117—118 
Differential, exact (complete, total), 565 
Differential quotient, partial, 43-44 
Diffraction law 
for force lines, 177 
Snellius, 221 
Diffusion coefficient, 543 


Diffusion equation, 526, 536-537 
improved, 536 
Dipole moment 
electric, 171 
magnetic, 190-192 
Dipole radiation, 264 
Dirac bracket, 282-283 
Dirac equation, 489 
adjoint, 497 
Dirac matrix, 490-494 
Dirac picture, 345-348 
Direct term, 445 
Dispersion, 221 
squared fluctuation, 516 
Dispersion relation, Kramers–Kronig, 23-24
Displacement current, Maxwell’s, 164, 204, 205
Displacement, electric, 174-176 
Displacement field 
electric, 174-176 
magnetic, 193-195 
Displacement law (Wien’s), 596 
Displacement operator, 317 
for Glauber state, 471 
Dissipation, 374—389 
Dissipation function (Rayleigh’s), 99 
Dissipative behavior, 541 
Distribution (generalized function), 18 
Divergence (source density), 11—12 
in general coordinates, 38—41 
Doppler effect, 236, 264 
quadratic, 236 
transverse, 236 
Double factorial, 401 
Double slit experiment, 280-281 
Doublet (two-level system), 308-310, 368 
density operator, 312 
Drag coefficient, 84, 236 
Drift term, 542 
Dulong-Petit law, 597 


E 
Eccentricity of an ellipse, 63 
Efficiency, thermal, 565 
Ehrenfest’s theorem, 339 
Eigen angular momentum, 324-325 
Eigen-representation, 295 
Eigenvalue, 87—90, 294-296 
Eigenvalue equation 
for the angular momentum, 329 
for the energy, 351-374 


Eigenvalue problem, 113-114 
Eigenvector, 87—90, 294-296 
Eikonal, 137 
Electron, outer, 362 
Elementary charge, 165-166, 624 
Ellipse, 63 
Elliptic functions (Jacobi) 
amplitude, 105-106 
cosinus amplitudinis, 146 
delta amplitudinis, 146 
sinus amplitudinis, 105, 146 
Elliptic integral, 103—106 
complete 
first kind, 104—105 
third kind, 149 
incomplete 
first kind, 103-105, 203 
third kind, 148-149 
Emission 
forced, 482 
spontaneous, 380, 485—486 
Energy 
bound, 575 
free, 567, 575 
of the electric field, 182 
internal, 513, 556 
kinetic, 70 
for time-dependent force, 151 
of two bodies, 72 
rigid body, 86 
potential, 56-58, 151 
generalized, 97-99 
of dipoles, 171-172, 198 
Energy conservation law, 78 
Energy density 
of the electric field, 182 
of the magnetic field, 211 
Energy flux density, 211 
Energy gap, 461 
Energy—momentum stress tensor, 248—249 
Energy representation, 417 
Ensemble, 515 
canonical, 554 
ergodic, 534 
grand canonical, 555 
generalized, 556-561 
micro-canonical, 549-550 
statistical, 279, 515-520 
Enthalpy, 567, 574-575 
free, 567, 572 
Entropy, 514 
Entropy law, 514, 525-546 
Entropy maximum, 552-561 


Equation, cubic, 604 
Equation of state 
canonical, 576 
thermal, 576, 582 
Equidistribution law, 559 
Equilibrium 
chemical, 560 
detailed, 527 
inhibited (partial), 557 
thermal (thermodynamic, statistical), 548
total, 557 
Equilibrium constant, 588 
Equilibrium distribution, 546-561 
Equilibrium state, stable, 576 
Error analysis, 50-51 
Error (average), 46-52 
of the single measurement, 50 
Error distribution, 47—49 
Error integral, 48 
Error limits, 44-52 
Error propagation, 49 
Error width, 516 
Euler angles, 30-31 
Euler-Lagrange equations 
generalized, 241 
Euler’s curvature radius, 7 
Euler’s theorem for homogeneous functions, 587
Eutecticum, 574 
Event, 228 
Exchange equilibrium, 557-561 
Exchange hole, 453 
Exchange symmetry, 434-436 
Exchange term, 445 
Excitation, magnetic, 193—195 
Expansion 
in terms of Glauber states, 476-478 
in terms of Legendre polynomials, 181 
in terms of orthonormal system, 286 
of operators, 297 
plane wave in terms of spherical waves, 399
Expansion coefficient, 569-571 
Expectation value, 47, 299 
Exponent, critical, 604 


F 

Factor, integrating (Euler’s), 566 
Faddeev equations, 432 

Faraday induction law, 205 
Fermi—Dirac statistics, 578-582 


Fermi energy, 355, 582 
Fermi gas, degenerate, 588-593 
Fermi gas model, 355 
Fermions, 435—440 
Fermi’s golden rule, 347 
Ferroelectric, 176 
Ferromagnet, 196 
Ferromagnetism, 607—608 
Feshbach theory, 423-426 
F (farad), 164 
Fictitious force, 90-92 
Fictitious resistance, 214 
Field constant 
electric, 165, 166 
magnetic, 164—165, 201 
Field, electromagnetic, 206-227 
Field equations 
electrostatics, 176-178 
magnetostatic, 195-197 
Field-line tube, 12 
Field operator, 301-303 
Field quantization, 278 
Field strength 
electric, 166 
magnetic, 193-195 
Field tensor, electromagnetic, 240-244 
Final-state interaction, 429 
Fine structure constant, 362, 624 
Fizeau experiment, 236 
Floquet operator, 117 
Floquet solution, 117 
Flow, isentropic, 574 
Fluctuation—dissipation theorem, 539-542 
Fluctuation, relative, 516 
Flux, 12 
Fock space, 438 
Fock state, 473 
Fokker—Planck equation, 542-546 
Foldy—Wouthuysen transformation, 503 
Force, 55—62 
generalized, 59-62 
stochastic, 538 
velocity-dependent, 97-99 
Force field, 77 
homogeneous, 57 
Force law, Ampère’s, 200-201
Force of constraint, 58 
Fourier series, integral, 21 
Fourier transform, 22—25, 216-220 
Four-momentum, canonical conjugate, 247 
Four-potential, 239 
Four-vector, 231—238 
Free-fall laws, 83-85 


Freezing point, lowering, 574, 587 
Frenet—Serret formulas, 8 
Fresnel’s equations, 222 
Friction, 97-99 

Newtonian, 84 

Stokes, 99 
Frictional constant, 538 
Fugacity, 582 
Functional derivative, 251 
Functional matrix, 34 
Function space, Hilbert, 286-287 
Fundamental solution, 116 


G 
Galilean transformation, 227 
Γ-space, 523
Gap condition, 461 
Gas 
ideal, 582-586 
real, 599 
Gas constant, 572 
Gauge transformation, 98, 209 
Gauss distribution, 47, 519 
Gauss force, 410 
Gauss’s theorem, 12 
Gay-Lussac law, 582 
Gell-Mann and Goldberger 
two-potential formula, 420 
Gell-Mann matrix, 297 
Generalized function (distribution), 18 
Generating function 
canonical transformations, 130-133 
of the Bessel functions, 480 
of the Hermite polynomials, 360 
of the Laguerre polynomials, 365 
of the Legendre polynomials, 82 
Gerschgorin’s theorem, 528 
G (gauss), 165 
Gibbs—Duhem relation, 572 
generalized, 587 
Glauber state, 471-473 
Golden rule, Fermi’s, 347, 382-386 
Gradient, 10-11 
in general coordinates, 38-41 
Graph 
connected, 432—433 
unconnected, 430 
Gravitation, 79 
Gravitational acceleration, 81—85 
Gravitational constant, 624 
Gravitational force, 55, 79 
Green function, 111 


of the Laplace operator, 27 
of the time-dependent oscillator, 119 
propagator, 406 

Green theorems, 17 

Group velocity, 354 


H 
Hamilton equations 
canonical, 122 
for a field, 252 
Hamilton function, 122—124 
Hamilton-Jacobi theory, 135-138 
Hamilton operator, 326, 351-374 
effective, 424 
non-Hermitian, 381—382 
Hankel function, 401 
Hartree–Fock–Bogoliubov equations, 459-462
Hartree-Fock equations, 454-455 
Heat, 563-565 
Joule, 188 
latent, 563 
specific, 569 
Heat capacity, 569-571 
Heat tone, 73 
Heisenberg equation, 339-340 
Heisenberg picture, 340-341 
Heisenberg’s uncertainty relation, 275-276 
Helicity, 219, 325 
Hellmann—Feynman theorem, 296 
Hermite polynomial, 359-361 
Herpolhode cone, 90 
H (henry), 164 
Hilbert space, 282-284 
convergence in, 283-284 
Hilbert vector, 282-287 
improper, 287 
orthogonal, 283 
parallel, 283 
Hill’s differential equation, 149 
Hole operator, 462 
Hooke’s law, 52 
H-theorem, 525 
Husimi function, 479 
Hydrogen atom, 361-367 
Hyperbolic orbit, 67 
Hysteresis curve, 196 


I 

Identity 
Euler’s, 101 
Jacobi 


for commutators, 289 
for Poisson brackets, 124 
for vector products, 4 
Image charge, 180-181 
Impedance, 213 
Impulse, 77 
Induced charge, 180 
Inductance, 201—203 
Induction, magnetic, 193-195 
Induction voltage, 206 
Inequality 
Bessel’s, 285 
Schwarz, 283 
Inertial ellipsoid, 89 
Inertial force, 90 
Inertial frame, 69 
Inertial law, 69 
Information entropy, 521-523 
Insertion of intermediate states, 285 
Insulator, 176-178 
Integrability condition 
Maxwell’s, 554, 568 
Integral principles, 139-142 


Integral theorems for vector expressions, 16-17
Interaction 
average, 449 
magnetic, 198-201 
non-local, 425 
separable, 425 
time-dependent, 345-348 
Interaction representation, 345-348 
Interface, and vector field, 27 
Inversion curve, 575 
Ising model, 607 
Isotropy, 40 


J 

Jacobi coordinates, 71 

Jacobi matrix, 34 
Jaynes—Cummings model, 482-486 
J (joule), 623

Joule cycle, 618 

Joule-Thomson effect, 575 


K 
Kepler problem, 62-68 
Kepler’s law 

first, 63 

second, 64 

third, 65 
Ket-vector, 283 


Kirchhoff’s laws, 189
Klein—Gordon equation, 501 
Koopman’s theorem, 455 


Kramers–Kronig (dispersion) relation, 23-24
Kramers—Moyal expansion, 543-544 
Kramers theorem, 314 
Kronecker symbol, 18 


L 
Ladder operator, 330 
Lagrange density, 241 
Lagrange equations 
first kind, 61-62 
second kind, 95-99 
Lagrange function, 96—100, 247 
generalized, 97 
Lagrangian multiplier, 61 
Laguerre polynomial, 365 
generalized, 364-366 
Lamb shift, 380 
Landau levels, 359 
Landé factor, 373 
Langevin equation, 537-539 
generalized, 542 
Laplace equation, 16, 176-177 
Laplace operator, 15 
Laplace transform, 110-111 
Larmor formula, 263 
Larmor precession, 343-344 
Lattice oscillation, 596-598 
Lattice vector, 31 
reciprocal, 31 
Lattice vibration (phonon), 359-361 
Law of mass action, 588 
Law of motion, Newton’s, 76 


Legendre polynomial, 81-83, 333-335 


Legendre transformation, 121, 567 
Leibniz formula, 364 

Lenz’s rule, 205 

Level repulsion, 310 

Level shift, 425 

Level splitting, 371-373 

Level width, 425 

Lever law, 59 

Levi-Civita tensor, 36 

Levinson theorem, 421 

Libration, 103 

Lie algebra, 290 
Liénard—Wiechert potential, 260-261 
Lifetime (average), 426, 527 
Light cone, 230 


Light quantum (photon), 466-470 

Line integral, 9 

Line of nodes, 30, 31 

Line width, natural, 264 

Liouville equation, 125, 129, 343, 381-389, 529
Lippmann–Schwinger equation, 406, 411-413

Lorentz contraction, 229, 230 

Lorentz distribution, 47, 425, 519 

Lorentz force, 78, 189-190, 244 

Lorentz gauge, 210 

Lorentz group, extended, 228 

Lorentz invariance, 216 

Lorentz transformation 
homogeneous, 228-231 
improper, 228 
inhomogeneous, 227, 228 
orthochronous, 228, 496 
proper, 228, 254 

Loschmidt number, 571 

Low equation, 415 


M 
Macro state, 515 
Magnetization, 191 
Magnetization current, 242 
Magneton, Bohr, 191, 327, 624 
Magnetostatics, 193-199 
Main theorem 

first, 513, 564 

second, 514, 564 

third, 514, 559 

zeroth, 513, 558 
Many-body state, 433-438 
Markov approximation, 379 
Mass 

inertial, 69 

reduced, 72 

relativistic, 245 
Mass unit, atomic, 571, 624 
Master equation, 526 
Matrix, 5 
Matrix element, 290 

reduced, 385 
Matrix mechanics, 287 
Maxwell—Boltzmann statistics, 577 
Maxwell distribution, 546-548 

local, 547 
Maxwell equations 

macroscopic, 206-208 

covariance, 241—244 


microscopic 
covariance, 239-241 
Maxwell relations, 554 
Maxwell’s 
construction (field lines), 167 
construction (van der Waals), 600 
Mean square fluctuation, 47 
Mean value, 46 
over time, 79 
Measurable quantity, 298 
Measurement process, 374 
Meissner—Ochsenfeld effect, 195 
Melting, 573 
Melting heat, 563 
Method of least squares, 51-52 
Metric, Hermitian, 282 
Metric tensor, 36 
Micro state, 515 
Minkowski diagram, 231 
Minkowski force, 248 
Minkowski metric, 232 
Mixing entropy, 574, 586 
Mixture 
of materials, 586-588 
of states, 280, 311-313 
complete, 312 
Mole, 571 
Molecular field approximation, 607 
Molecular motion, Brownian, 537 
Moment, magnetic, 190-192 
Moment of inertia, 86—90 
Momentum, 69-70 
canonical conjugate, 99-101 
mechanical, 100 
of two bodies, 72 
Momentum conservation law, 69 
Momentum density of the radiation field, 215
Momentum representation, 317—323, 417 
Monopole (charge distribution), 171 
Motional quantity (momentum), 69 
Motion, force-free, 69-73 
Multipole moment, 171, 181 
μ-space, 523
Mutual inductance, 201—203 


N 

Nabla, 10 

Negative-frequency part, 469 
Neumann formula (inductance), 201 
Neumann function, 401 

Newton’s axiom 


first, 69 

second, 76 

third, 55 
N (newton), 623 
Normal acceleration, 7 
Normal coordinates, 113—115 
Normal distribution, 47, 519 
Normalizable (function, state), 286 
Normal order, 477 
Normal stress (pressure/tension), 183 
Normal vector, 7 
Norm (length of a Hilbert vector), 283-284 
Nutation, 89 


O
Observable, 298—299 
Occupation number, average, 580 
Occupation-number representation, 440, 578
Oe (oersted), 165 
Ohm’s law, 187 
for AC current, 213 
Ω (ohm), 164
One-particle density operator, 445 
One-particle state, 433 
Opalescence, critical, 604 
Operator, 288-315 
adjoint, 292 
anti-linear, 289, 313 
commuting, 289 
diagonalization, 295 
expansion, 297 
Hermitian, 292-293 
idempotent, 291 
inverse, 292 
linear, 289-315 
local, 299 
orthogonal, 297 
representation, 290 
self-adjoint, 292-293 
trace, 294 
unitary, 293 
Optical theorem, 418 
Optics, geometrical, 135-138 
Order parameter, 604 
Ornstein–Fürth relation, 535-537
Ørsted law, 195
Orthogonal system of the Legendre polynomials, 82
Orthonormal set of functions, 21 
Orthonormal system, 284 
Oscillating circuit, 213-214 


Oscillation 
coupled, 112-115 
damped, 106-112 
forced, 108-112 
harmonic, 102 
differential equation, 106 
quantum-mechanical, 358-361 
Oscillator (see also oscillation) 
time-dependent, 116—120, 149-151 
Otto cycle, 618 
Outer electron, 362 
Over-complete basis, 472 


P 
Pair force, 456 
Pa (pascal), 623 


Paradox 
Gibbs’, 578, 586 
Zeno’s, 382 


Paraelectric, 175 
Parallel connection, 189 
Paramagnet, 196 
Paramagnetism, 605—606 
Parameter 
extensive, 552, 571 
intensive, 552, 571 
Parametric amplification, 475 
Parity, 314 
Parity operation, 29, 228 
Parseval’s equation, 23 
Partial system, 520-521 
Particle, free, 353 
Particles, identical, 577 
Partition function, 549-556 
canonical, 554 
Path curvature, 7—9 
Pauli equation, 327, 504 
rate equation, 382 
Pauli operator, 308 
Pauli principle, 303, 435 
Pendulum, 101—106 
Foucault’s, 91 
mathematical, 101 
oscillation period, 104 
spherical, 145-149 
Permeability, 196 
Permittivity (dielectric constant), 176 
Perturbation theory, 134 
of Schrödinger and Rayleigh, 369
of Wigner and Brillouin, 369 
time-dependent, 346 
time-independent, 368-370 


P-function, 479 
Phase (aggregation states), 572 
Phase convention 
for fermion states, 439 
of Condon and Shortley, 331, 337 
Phase integral, 136 
Phase operator, 304-307 
Phase shift, 102, 109 
Phase space, 121 
larger, 523 
Phase space cell, 523-525 
Phase transition, 572, 599-611 
first order, 603 
second order, 604 
Phase velocity, 137, 225, 354 
Phonon, 359 
Photon, 359, 466-470 
Planck distribution, 595 
Planck’s action quantum, 276, 624 
Plane 
invariant, 89 
reflection and diffraction at, 220—223 
Plane of incidence, 221 
Planetary motion (Kepler problem) 
as two-body problem, 79-80 
Plate capacitor, 180 
Poincaré group, 228 
Poinsot’s construction, 89 
Point, critical, 600 
Poisson bracket, 124-125 
Poisson distribution, 519 
Poisson equation, 27, 169 
Polar distance, 39 
Polarizability of molecules, 175 
Polarization 
electric, 174-176 
for doublets, 312 
magnetic, 191 
Polarization direction, 218—220 
Polhode cone, 90 
Polylogarithm, 593 
Position vector, 1
Positive-frequency part, 469 
Potential, 77 
chemical, 560 
electrostatic, 168—170 
gauge, 169 
grand canonical, 567, 579 
thermodynamic, 566-569 
time-dependent, 208-211 
Power of electric currents, 188 
Poynting’s theorem, 211-213 
Poynting vector, 211-213 


Precession, 90 

pseudo-regular, 148 

regular, 148 
Pressure, 560 
Pressure coefficient, 569-571 
Principal axes, dielectric, 176 
Principal axis transformation, 87-90 
Principal moment of inertia, 87-90 
Principal quantum number, 363 


Principal theorem of vector analysis, 25-27 


Principal-value integral, 19 
Principle 

Boltzmann’s, 550 

d’Alembert’s, 93-97

Fermat’s, 141, 246 

geodesic, 246 

Hamilton’s, 140 

of least action, 141 

of least time, 141 

of virtual work, 58-59 
Probability, 279 

thermodynamic, 550 
Probability wave, 277-279 
Problem, inverse, 62 
Product 

dyadic (tensor product), 11 

inner (scalar product), 3 

of states, 282-283 

of one-particle states, 433 

outer (vector product), 4 
Projection operator, 291 
Propagation of waves 

in conductors, 224—226 

in insulators, 215—220 
Propagator, 369 

energy-dependent, 406—413 

time-dependent, 405 
Proper length, 230 
Proper time, 230 
Pseudo-momentum, 100 
Pseudo-scalar, 6 
Pseudo-vector, 6 


Q 

Q-function, 479 

Quabla, 239 

Quanta, 279 

Quantity 
complementary, 275 
physical, 1

Quantization, 278 
second, 278, 450 


Quantization direction, 328 
Quantum electrodynamics, 463—487 
Quantum number, 294 

good, 339, 370 
Quantum statistics, 578—582 
Quasi-particles, 458 
Quasi-probability, 324 
Quasi-probability density, 479 
Quasi-static current, 205 
Quenched state, 473-476 


R 
Rabi frequency, 483 
Radial equation, 353 
Radial quantum number, 363 
Radiation constant, 596 
Radiation, electromagnetic, 594-596 
Radiation energy, 258-259 
Radiation field, 256-258 

of a dipole, 261-266 

of a point charge, 260-261 
Radiation formula (Planck), 595 
Radiation gauge, 210, 256 
Radiation pressure, 594 
Radiation source, 253 
Radiative reaction, 264 
Radius, Bohr, 362 
Random walk, 536 
Rapidity, 235 
Rate equation, 382-386, 526 
Ray in Hilbert space, 282 
Rayleigh—Jeans law, 595 
Ray optics, 135-138 
Reactance, 213 
Reaction, endothermic, 588 
Real-space representation, 317—323 
Recursion relation 

for Bessel functions, 400 

for Hermite polynomials, 360 


for Laguerre polynomials, 365-366 


for Legendre polynomials, 82 

for spherical harmonics, 333 
Reference frame, accelerated, 90—92 
Reflectivity of steps, 357 
Refractive index, 137—138, 221 
Relativistic dynamics 

of free particles, 244—246 

with external forces, 247—248 
Relaxation time, 106, 527—529 
Representation 

coupled, 336 

of a Hilbert vector, 285 


uncoupled, 336 

Repulsion of the current, 224, 225 

Residual interaction, 449, 456-457 

Residue theorem, 20 

Resistance, electric, 187 

Resolvent, 406 

Resonance, 425—427, 486 
parametric, 119 

Response function, 539-542 

Rest energy, 245 

Rest mass, 245 

Right-hand rule, 195 

Rodrigues’ formula 
for Hermite polynomials, 359 
for Laguerre polynomials, 364 
for Legendre polynomials, 334 


Rotating-wave approximation, 379, 485 


Rotation, 13 
Rotational energy, 86 
Rotation (curl density), 13—14 

in general coordinates, 38—41 
Rotation matrix, 30-31, 153 
Rotation (vortex density), 13—14 
Row vector, 3 
Rutherford cross-section, 67, 423 
Rydberg energy, 362 
Rydberg state, 362 


S 
Saturation intensity, 486 
Saturation magnetization, 608 
Scalar product, 3 
of states, 282-283 
Scalar (tensor of zeroth rank), 35 
Scalar triple product, 4 
Scattering amplitude, 399—402, 416 
Scattering angle, 67 
Scattering cross-section, 417—418 
Scattering operator, 414—415 
Scattering phase, 421 
Schrödinger equation 
time-dependent, 341 
time-independent, 351-374 
Schrödinger picture, 340-345 
Self-inductance, 212 
Semi-classical ansatz, 485 
Separatrix, 103 
Sequence space, Hilbert, 285 
Series 
Hausdorff, 290 
Neumann, 405 
semi-convergent, 49 


Series connection, 189 
Set of field lines, 9-10 
Shear stress, 183 
Single-particle model, 550-552 
Singlet state, 337 
Skin effect, 225 
Slater determinant, 438 
Sommerfeld parameter, 422 
Sound velocity, 570 
Source density, 11—12 
Space, 1
Space-like interval, 230, 231 
Space reflection, 29, 228 
Spherical capacitor, 179 
Spherical coordinates, 39 
Spherical harmonic, 331-335 
Spin, 324-325 
Spin angular momentum, 324—325 
Spinor, 325 

adjoint, 497 
Spin-orbit coupling, 244, 371-373 
Squared fluctuation, 516 
S (siemens), 164 
Standard deviation, 516 


Standard representation of Dirac matrices, 492

State, 565-566 

coherent, 471 

degenerate, 554 

entangled, 375 

pure, 280-281, 311 

quantum-mechanical, 280-281 

stationary, 342 

irreversible change of, 527, 558 
State variable, 513, 563, 565-566 
Static friction, 58 
Statistics, 513-525 

classical, 523-524 


Stefan—Boltzmann constant, 594, 624 


Stefan—Boltzmann equation, 595 
Steiner’s theorem, 86 
Step function (theta function), 18 
Stepwise decay, 429, 529 
Stirling formula, 518 
Stokes’s theorem, 13 
Stress coefficient, 569 
Stress tensor, 183 

Maxwell’s, 184 


Structure constant (Lie algebra), 297 


Sublimation, 573 
Sublimation heat, 563 


Summation convention (Einstein), 33, 231, 232


Sum rule, 372 
Superconductor, 188, 195 
Superposition principle, 279-281 
Surface divergence, 27 
Surface element, 9 
Surface rotation, 27 
Surface tension, 183—184 
Susceptibility 
electric, 175-176 
generalized, 539-542 
magnetic, 196, 606 
Synchrotron radiation, 265-266 
System 
closed, 526 
homogeneous, 571-572 
open, 375 


T 
Tangential acceleration, 7 
Tangent vector, 7 
Taylor series, 11 
Telegraph equation, 224 
Temperature, 513, 558 
micro-canonical, 555 
Tension, mechanical, 183 
Tensor, 35—42, 183-184 
totally anti-symmetric, 36 
Tensor contraction, 35 
Tensor extension, 41 
Tensor force, 56, 199-201 
Tensor product, 3 
Theta function (step function), 18 
Throttling experiment, 574-575 
Time, 1 
Time dilation, relativistic, 230 
Time-like interval, 230 
Time-ordering operator, 346 
Time reversal, 228 
Time-shift matrix, 117 
Time shift operator, 340, 403-405 
Top 
force-free, 92, 147 
heavy, 144-149 
Torque, 58 
on dipole, 171—172 
Torsion, 8-9 


Total reflection (limiting angle), 223 


Trace 

of a matrix, 36 

of an operator, 294 
Trajectory, 6-9 
Transformation 


canonical, 125—138 
infinitesimal, 129 
infinitesimal, 293 
isometric, 293 
Landen’s (elliptic integrals), 203 
of electromagnetic fields, 243-244 
orthogonal, 29 
unitary, 29, 293-294 
Transition amplitude, 299, 402 
Transition operator, 415-417 
Transition probability, 383 
Transition rate, 383 
Transmittance at steps, 357—358 
Transverse gauge, 210 
Trap circuit, 214 
Triangle inequality, 283 
Triple point, 573 
Triple product, 4 
Triplet state, 337 
T (tesla), 164 
Tunnel effect, 358, 361 
Two-body problem, 79 
Two-body system, 443-445 
Two-by-two matrix 
inverse, 71 
Pauli matrices, 308 
Two-potential formula, 419-420 
2-spinor, 327 


U
Uncertainty, 50, 516
quantum-mechanical, 299-301
Uncertainty relation, 275-276, 525
particle number–phase, 307
time-energy, 426
Unit operator, 289
Unit system
Gauss, 165 
international, 164-165 

Unit vector, 3 

complex, 219 


V
Van der Waals equation, 599—605 
Vaporization, 573 
Vaporization enthalpy, 563 
Vaporization heat, 563 
Variable 
conjugate, 122 
natural, 567 
Variance, 47, 516 


Variation, 58 
Variational method, 370 
Vector, 2-28 
axial, 6 
Lenz, 63 
polar, 6 
tensor of first rank, 35 
Vector algebra, 2—6 
Vector field, 9 
interface, 27 
longitudinal, transverse, 25 
Vector potential, 98, 197—198 
gauge, 197 
Vector product, 4 
Vectors 
in function space, 286-287 
in sequence space, 285 
orthogonal, 3 
Velocity field, 257 
Velocity four-vector, 234-236 
Velocity of light in vacuum, 227 
Velocity parameter, 235 
Virial theorem, 79 
Virtual displacement, 58 
Viscosity, 574 
Vlasov equation, 530 
Voltage, electric, 169 
Von Neumann equation, 342-345 
Vortex, 14 
Vortex density, 13-14 
V (volt), 164, 623 


W 
Wave 
evanescent, 223 
polarized 
circularly, 219 
elliptically, 219 
linearly, 219 
propagation in insulators, 215-220 
Wave equation 
homogeneous, 216 
inhomogeneous, 253-256 
solution 
advanced, 254 
retarded, 254 
Wave function, 320-323 
probability amplitude, 279 
Wave mechanics, 287 
Wave operators (Møller’s), 413-414
Wave packet, 321, 354 
Wave-particle duality, 276-277 


Wave resistance, 222
Wave vector, 24, 137
Wave vector representation, 320, 417
Wb (weber), 164, 623
Weber’s equation, 216
Weight, 80
specific, 571
Weight function, 366
Weyl correspondence, 326
Weyl representation (Dirac matrices), 492
Wigner–Eckart theorem, 385
Wigner force, 409
Wigner function, 321-324
time dependence, 344
Winding, 8
Work, 56
mechanical, 563-565, 574
World point, 230
Wronski determinant, 116
W (watt), 623


Y
Yukawa force, 410


Z
Zero operator, 288
Zero-point energy, 359
Zero vector, 2
Zeta function (Riemann), 590-591
Zitterbewegung, 498, 505


